Open-world query answering is the problem of deciding, given a set of facts, conjunction
of constraints, and query, whether the facts and constraints imply the query. This
amounts to reasoning over all instances that include the facts and satisfy the constraints.
We study finite open-world query answering (FQA), which assumes that the underlying
world is finite and thus only considers the finite completions of the instance. The major
known decidable cases of FQA derive from the following: the guarded fragment of first-order
logic, which can express referential constraints (data in one place points to data in
another) but cannot express number restrictions such as functional dependencies; and the
guarded fragment with number restrictions but on a signature of arity only two. In this
paper, we give the first decidability results for FQA that combine both referential constraints
and number restrictions for arbitrary signatures: we show that, for unary inclusion
dependencies and functional dependencies, the finiteness assumption of FQA can be lifted up
to taking the finite implication closure of the dependencies [8]. Our result relies on new
techniques to construct finite universal models of such constraints, for any bound on the
maximal query size.


I. Introduction 

A longstanding goal in computational logic is to design logical languages that are both decidable and
expressive. One approach is to distinguish integrity constraints and queries, and have separate
languages for them. We would then seek decidability of the query answering with constraints problem:
given a query $q$, a conjunction of constraints $\Sigma$, and a finite instance $I$, determine which answers to $q$
are certain to hold over any instance $I'$ that extends $I$ and satisfies $\Sigma$. This problem is often called
open-world query answering. It is fundamental for deciding query containment under constraints, querying
in the presence of ontologies, or reformulating queries with constraints. Thus it has been the subject of
intense study within several communities for decades (e.g. [11, 5, 3, 15, 10]).

In many cases (e.g., in databases) the instances $I'$ of interest are the finite ones, and hence we can
define finite open-world query answering (denoted here as FQA), which restricts the quantification
to finite extensions $I'$ of $I$. In contrast, by unrestricted open-world query answering (UQA) we refer
to the problem where $I'$ can be either finite or infinite. Generally the class of queries is taken to be
the conjunctive queries (CQs), that is, queries built up from relational atoms via existential quantification
and conjunction. We will restrict to CQs here, and thus omit explicit mention of the query language,
focusing on the constraint language.

A first constraint class known to have tractable open-world query answering problems is that of inclusion
dependencies (IDs), i.e., constraints of the form, e.g., $\forall x y z\; R(x,y,z) \rightarrow \exists v w\; S(z,v,w,y)$. The fundamental
results of Johnson and Klug [11] and Rosati [18] show that both FQA and UQA are decidable for IDs
and that, in fact, they coincide. When this happens, the constraints are said to be finitely controllable.
These results have been generalized by Bárány et al. [3] to a much richer class of constraints, the
guarded fragment of first-order logic.

However, those results do not cover a second important kind of constraints, namely number restrictions,
which express, e.g., uniqueness. We represent them by the class of functional dependencies
(FDs), of the form $\forall \mathbf{x} \mathbf{y}\; (R(x_1, \ldots, x_n) \wedge R(y_1, \ldots, y_n) \wedge \bigwedge_{i \in L} x_i = y_i) \rightarrow x_r = y_r$. The implication problem
(does one FD follow from a set of others) is decidable for FDs, and coincides with implication
restricted to finite instances [1]. Trivially, the FQA and UQA problems are also decidable for FDs
alone, and coincide.

Trying to combine IDs and FDs makes both UQA and FQA undecidable in general [5]. However,
UQA is known to be decidable when the FDs and the IDs are non-conflicting [11, 5]. Intuitively, this
condition guarantees that the FDs can be ignored, as long as they hold on the initial instance $I$, and
one can then solve the query answering problem by considering the IDs alone. But the non-conflicting
condition only applies to UQA and not to FQA. In fact it is known that even for very simple classes
of IDs and FDs, including non-conflicting classes, FQA and UQA do not coincide. Rosati [18] showed
that FQA is undecidable for non-conflicting IDs and FDs (indeed, for IDs and keys, which are less rich
than FDs).

Thus a general question is to what extent these classes, FDs and IDs, can be combined while retaining
decidable FQA. The only decidable cases impose very severe requirements. For example, the constraint
class of “single KDs and FKs” introduced in [18] has decidable FQA, but such constraints cannot model,
e.g., FDs which are not keys. Further, in contrast with the general case of FDs and IDs, single KDs and
FKs are always finitely controllable, which limits their expressiveness. Indeed, we know of no tools to
deal with FQA for non-finitely-controllable constraints on relations of arbitrary arity.

A second decidable case is where all relation symbols and all subformulas of the constraints have
arity at most two. In this context, results of Pratt-Hartmann [15] imply the decidability of both FQA
and UQA for a very rich non-finitely-controllable sublogic of first-order logic. For some fragments of
this arity-two logic, the complexity of FQA has recently been isolated by Ibáñez-García et al. [10]. Yet
these results do not apply to arbitrary arity signatures.

The contribution of this paper is to provide the first result about finite query answering for
non-finitely-controllable IDs and FDs over relations of arbitrary arity. As the problem is undecidable in
general, we must naturally make some restriction. Our choice is to limit to Unary IDs (UIDs), which
export only one variable: for instance, $\forall x y z\; R(x,y,z) \rightarrow \exists w\; S(w,x)$. UIDs and FDs are an interesting
class to study because they are not finitely controllable, and allow the modeling, e.g., of single-attribute
foreign keys, a common use case in database systems. The decidability of UQA for UIDs and FDs is
known because they are always non-conflicting. In this paper, we show that finite query answering is
decidable for UIDs and FDs, and obtain tight bounds on its complexity.

The idea is to reduce the finite case to the unrestricted case, but in a more complex way than by 
finite controllability. We make use of a technique originating in Cosmadakis et al. [8] to study finite 
implication on UIDs and FDs: the finite closure operation which takes a conjunction of UIDs and FDs 
and determines exactly which additional UIDs and FDs are implied over finite instances. Rosati [17] 
and Ibáñez-García et al. [10] make use of the closure operation in their study of constraint classes over
schemas of arity two. They show that finite query answering for a query $q$, instance $I$, and constraints
$\Sigma$ reduces to unrestricted query answering for $I$, $q$, and the finite closure $\Sigma'$ of $\Sigma$. In other words, the
closure construction which is sound for implication is also sound for query answering. 

We show that the same general approach applies to arbitrary arity signatures, with constraints being 
UIDs and FDs. Our main result thus reduces finite query answering to unrestricted query answering, 
for UIDs and FDs in arbitrary arity: 

Theorem I.1. For any finite instance $I$, conjunctive query $q$, and constraints $\Sigma$ consisting of UIDs
and FDs, the finite open-world query answering problem for $I$, $q$ under $\Sigma$ has the same answer as the
unrestricted open-world query answering problem for $I$, $q$ under the finite closure of $\Sigma$.

Using the known results about the complexity of UQA for UIDs, we isolate the precise complexity 
of finite query answering with respect to UIDs and FDs, showing that it matches that of UQA: 

Corollary I.2. The combined complexity of the finite open-world query answering problem for UIDs
and FDs is NP-complete, and it is PTIME in data complexity (that is, when the constraints and query
are fixed).

Our proof of Theorem 1.1 is quite involved, since dealing with arbitrary arity models introduces 
many new difficulties that do not arise in the arity-two case or in the case of IDs in isolation. We 
borrow and adapt a variety of techniques from prior work: using k-bounded simulations to preserve 
small acyclic CQs [10], dealing with UIDs following a topological sort [8, 10], performing a chase that 
reuses sufficiently similar elements [18], and taking the product with groups of large girth to blow up 
cycles [14]. However, we must also develop some new infrastructure to deal with number restrictions 
in an arbitrary arity setting: distinguishing between so-called dangerous and non-dangerous positions 
when chasing, constructing realizations for relations in a piecewise manner following the FDs, reusing 
elements in a combinatorial way that shuffles them to avoid violating the higher-arity FDs, and a new
notion of mixed product to blow cycles up while preserving fact overlaps to avoid violating the
higher-arity FDs.

Paper structure. The general scheme, presented in Section III, is to construct models of UIDs and 
FDs that are universal up to a certain query size k, which we call k-universal models. We start with only 
unary FDs (UFDs) and acyclic CQs (ACQs), and by assuming that the UIDs and UFDs are reversible, 
a condition inspired by the finite closure construction. 

As a warm-up, Section IV proves the weakened result for a much weaker notion than k-universality,
starting with binary signatures and generalizing to arbitrary arity. We extend the result to k-universality 
in Section V, maintaining a k-bounded simulation to the chase, and performing thrifty chase steps that 
reuse sufficiently similar elements without violating UFDs. We also rely on a structural observation 
about the chase under UIDs (Theorem V.11). Section VI eliminates the assumption that dependencies
are reversible, by partitioning the UIDs into classes that are either reversible or trivial, and satisfying
successively each class following a certain ordering. 

We then generalize our result to higher-arity (non-unary) FDs in Section VII. This requires us to 
define a new notion of thrifty chase steps that apply to instances with many ways to reuse elements; the 
existence of these instances relies on a combinatorial construction of models of FDs with a high number
of facts but a small domain (Theorem VII.7). Last, in Section VIII, we apply a cycle blowup process 
to the result of the previous constructions, to go from acyclic to arbitrary CQs through a product with 
acyclic groups. The technique is inspired by Otto [14] but must be adapted to respect FDs. 

Complete proofs of our results are provided in the appendix. 

II. Background 

Instances. We assume an infinite countable set of elements (or values) $a, b, c, \ldots$ and variable names
$x, y, z, \ldots$. A schema $\sigma$ consists of relation names (e.g., $R$) with an arity (e.g., $|R|$) which we assume is
$\geq 1$. Following the unnamed perspective, the set of positions of $R$ is $\mathrm{Pos}(R) := \{R^i \mid 1 \leq i \leq |R|\}$, and
we define $\mathrm{Pos}(\sigma) := \bigcup_{R \in \sigma} \mathrm{Pos}(R)$. We identify $R^i$ and $i$ when no confusion can result.

A relational instance (or model) $I$ of $\sigma$ is a set of ground facts of the form $R(\mathbf{a})$ where $R$ is a relation
name and $\mathbf{a}$ an $|R|$-tuple of values. The size $|I|$ of an instance $I$ is its number of facts. The active domain
$\mathrm{dom}(I)$ of $I$ is the set of the elements which appear in $I$. For any position $R^i \in \mathrm{Pos}(\sigma)$, we define the
projection $\pi_{R^i}(I)$ of $I$ to $R^i$ as the set of the elements of $\mathrm{dom}(I)$ that occur at position $R^i$ in $I$. For
$L \subseteq \mathrm{Pos}(R)$, the projection $\pi_L(I)$ is a set of $|L|$-tuples defined analogously; for convenience, departing
from the unnamed perspective, we index those tuples by the positions of $L$. A superinstance of $I$ is a
(not necessarily finite) instance $I'$ such that $I \subseteq I'$.

A homomorphism from an instance $I$ to an instance $I'$ is a mapping $h : \mathrm{dom}(I) \rightarrow \mathrm{dom}(I')$ such that,
for every fact $F = R(\mathbf{a})$ of $I$, the fact $h(F) := R(h(a_1), \ldots, h(a_{|R|}))$ is in $I'$.

Constraints. We consider integrity constraints (or dependencies) which are special sentences of
first-order logic. As usual in the relational setting, we do not allow function symbols. The definition of an
instance $I$ satisfying a constraint $\Sigma$, written $I \models \Sigma$, is standard.

An inclusion dependency ID is a sentence of the form $\tau : \forall \mathbf{x}\; R(x_1, \ldots, x_n) \rightarrow \exists \mathbf{y}\; S(z_1, \ldots, z_m)$, where
$\mathbf{z} \subseteq \mathbf{x} \cup \mathbf{y}$ and no variable occurs twice in $\mathbf{z}$. The exported variables are the variables of $\mathbf{x}$ that occur
in $\mathbf{z}$, and the arity of the dependency is the number of such variables. This work only studies unary
inclusion dependencies (UIDs) which are the IDs with arity 1. If $\tau$ is a UID, we write $\tau$ as $R^p \subseteq S^q$,
where $R^p$ and $S^q$ are the positions of $R(\mathbf{x})$ and $S(\mathbf{z})$ where the exported variable occurs. For instance,
the UID $\forall x y\; R(x,y) \rightarrow \exists z\; S(y,z)$ is written $R^2 \subseteq S^1$. We assume without loss of generality that there are
no trivial UIDs of the form $R^p \subseteq R^p$.

We say that a conjunction $\Sigma_{\mathrm{UID}}$ of UIDs is transitively closed if it is closed under implication by the
transitivity rule: if $R^p \subseteq S^q$ and $S^q \subseteq T^r$ are in $\Sigma_{\mathrm{UID}}$, then so is $R^p \subseteq T^r$ unless it is trivial. The transitive
closure of $\Sigma_{\mathrm{UID}}$ can clearly be computed in PTIME in $\Sigma_{\mathrm{UID}}$, and it contains all non-trivial UIDs implied
by $\Sigma_{\mathrm{UID}}$ over finite or unrestricted instances [7]. We say a UID $\tau : R^p \subseteq S^q$ is reversible relative to $\Sigma_{\mathrm{UID}}$
if both $\tau$ and its reverse $\tau^{-1} := S^q \subseteq R^p$ are in $\Sigma_{\mathrm{UID}}$.
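
As an illustration, the transitivity rule and the reversibility test can be sketched in Python. This is a naive fixpoint under assumptions made for the sketch: a position $R^p$ is encoded as a pair `(relation, index)`, a UID $R^p \subseteq S^q$ as a pair of positions, and the names `transitive_closure` and `is_reversible` are hypothetical.

```python
from itertools import product


def transitive_closure(uids):
    """Close a set of UIDs under the transitivity rule, dropping trivial UIDs."""
    closed = {(lhs, rhs) for lhs, rhs in uids if lhs != rhs}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that adding to `closed` is safe.
        for (a, b), (c, d) in product(list(closed), repeat=2):
            if b == c and a != d and (a, d) not in closed:
                closed.add((a, d))
                changed = True
    return closed


def is_reversible(uid, uids):
    """A UID is reversible relative to `uids` if it and its reverse are both in `uids`."""
    lhs, rhs = uid
    return (lhs, rhs) in uids and (rhs, lhs) in uids


# Hypothetical input: R^2 included in S^1, and S^1 included in T^1.
uids = {(("R", 2), ("S", 1)), (("S", 1), ("T", 1))}
closure = transitive_closure(uids)
```

The loop stops after polynomially many additions, since at most one UID exists per ordered pair of positions.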

A functional dependency FD is a sentence of the form $\phi : \forall \mathbf{x} \mathbf{y}\; (R(x_1, \ldots, x_n) \wedge R(y_1, \ldots, y_n) \wedge \bigwedge_{R^l \in L} x_l = y_l) \rightarrow x_r = y_r$,
where $L \subseteq \mathrm{Pos}(R)$ and $R^r \in \mathrm{Pos}(R)$. For brevity, we write $\phi$ as $R^L \rightarrow R^r$. We call $\phi$ a
unary functional dependency UFD if $|L| = 1$; otherwise it is higher-arity. For instance, $\forall x x' y y'\; R(x,x') \wedge
R(y,y') \wedge x' = y' \rightarrow x = y$ is a UFD, and we write it $R^2 \rightarrow R^1$. We assume that $|L| > 0$, i.e., we do not
allow nonstandard or degenerate FDs. We call $\phi$ trivial if $R^r \in R^L$, in which case $\phi$ always holds. Two
facts $R(\mathbf{a})$ and $R(\mathbf{b})$ violate a non-trivial FD $\phi$ if $\pi_L(\mathbf{a}) = \pi_L(\mathbf{b})$ but $a_r \neq b_r$.
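
The violation condition can be checked mechanically. The following Python sketch is only an illustration: the encoding of a fact as `(relation, values)` and of an FD $R^L \rightarrow R^r$ as `(relation, L, r)` with 1-based positions, as well as the function names, are assumptions of the sketch.

```python
from itertools import combinations


def violates(fact_a, fact_b, fd):
    """Return True if the two facts violate the FD R^L -> R^r."""
    relation, lhs, r = fd
    (rel_a, a), (rel_b, b) = fact_a, fact_b
    if rel_a != relation or rel_b != relation:
        return False
    # The facts agree on every position of L but differ at position r.
    agree_on_lhs = all(a[l - 1] == b[l - 1] for l in lhs)
    return agree_on_lhs and a[r - 1] != b[r - 1]


def satisfies_fd(instance, fd):
    """An instance satisfies an FD if no pair of its facts violates it."""
    return not any(violates(f, g, fd) for f, g in combinations(instance, 2))


ufd = ("R", (2,), 1)  # the UFD R^2 -> R^1
instance = {("R", ("a", "c")), ("R", ("b", "c"))}
```

Here the two facts agree at position 2 but differ at position 1, so `instance` violates `ufd`, while it satisfies the UFD in the other direction.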

The key dependency $\kappa : R^L \rightarrow R$, for $L \subseteq \mathrm{Pos}(R)$, is the conjunction of FDs $R^L \rightarrow R^r$ for all $R^r \in
\mathrm{Pos}(R)$; it is unary if $|L| = 1$. If $\kappa$ holds, we call $L$ a key (or unary key) of $R$.

Queries. An atom $A = R(\mathbf{t})$ consists of a relation name $R$ and an $|R|$-tuple $\mathbf{t}$ of variables or constants.
A conjunctive query CQ is an existentially quantified conjunction of atoms. In this paper we focus for 
simplicity on Boolean queries (queries without free variables), but all our results hold for non-Boolean 
queries as well, by the standard method of enumerating the assignments. The size \q\ of a CQ q is its 
number of atoms. 

A Berge cycle in a Boolean CQ $q$ is a sequence $A_1, x_1, A_2, x_2, \ldots, A_n, x_n$ with $n \geq 2$, where the $A_i$ are
pairwise distinct atoms of $q$, the $x_i$ are pairwise distinct variables of $q$, and $x_i$ occurs in $A_i$ and $A_{i+1}$ for
$1 \leq i \leq n$ (with addition modulo $n$, so $x_n$ occurs in $A_1$). We call $q$ acyclic if $q$ has no Berge cycle and if
no variable of $q$ occurs more than once in the same atom. We write ACQ for the class of acyclic CQs.
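
One way to test this notion of acyclicity is to observe that a Berge cycle is exactly a cycle in the bipartite incidence graph between atoms and variables. The Python sketch below uses union-find to detect such a cycle; it assumes a constant-free CQ whose atoms are encoded as `(relation, variables)` pairs, and the name `is_acyclic` is hypothetical.

```python
def is_acyclic(atoms):
    """Check acyclicity of a constant-free Boolean CQ given as (relation, variables) atoms."""
    atoms = list(dict.fromkeys(atoms))  # a CQ is a set of atoms: drop duplicates
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for index, (_, variables) in enumerate(atoms):
        if len(set(variables)) != len(variables):
            return False  # a variable occurs twice in the same atom
        for var in variables:
            atom_root = find(("atom", index))
            var_root = find(("var", var))
            if atom_root == var_root:
                return False  # this incidence closes a cycle, i.e. a Berge cycle
            parent[atom_root] = var_root
    return True
```

For instance, the path query with atoms R(x,y) and S(y,z) is acyclic, whereas two atoms sharing two variables already form a Berge cycle with $n = 2$.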

A Boolean CQ $q$ holds in an instance $I$ exactly when there is a homomorphism $h$ from the atoms of $q$
to $I$ such that $h$ is the identity on the constants of $q$ (we call this a homomorphism from $q$ to $I$). The
image of $h$ is called a match of $q$ in $I$.
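
On a finite instance, this condition can be decided by a brute-force search for a homomorphism. The following Python sketch assumes a constant-free query, with atoms and facts both encoded as `(relation, tuple)` pairs; the function name `holds` is hypothetical, and the search is exponential in the query size in the worst case.

```python
def holds(query, instance, assignment=None):
    """Decide whether a constant-free Boolean CQ holds in a finite instance.

    `query` is a list of atoms; every term of an atom is a variable.
    """
    assignment = assignment or {}
    if not query:
        return True  # every atom has been mapped to a fact
    (relation, variables), rest = query[0], query[1:]
    for fact_relation, values in instance:
        if fact_relation != relation or len(values) != len(variables):
            continue
        extended = dict(assignment)
        # Map each variable to the value at the same position, consistently.
        if all(extended.setdefault(v, a) == a for v, a in zip(variables, values)):
            if holds(rest, instance, extended):
                return True
    return False
```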

QA problems. We define the unrestricted open-world query answering problem (UQA) as follows:
given a finite instance $I$, a conjunction of constraints $\Sigma$, and a Boolean CQ $q$, decide whether there is a
superinstance of $I$ that satisfies $\Sigma$ and violates $q$. If there is none, we say that $I$ and $\Sigma$ entail $q$ and write
$(I, \Sigma) \models_{\mathrm{unr}} q$.

This work focuses on the finite query answering problem (FQA), which is the variant of open-world
query answering where we require the counterexample superinstance to be finite; if none exists, we
write $(I, \Sigma) \models_{\mathrm{fin}} q$. Of course $(I, \Sigma) \models_{\mathrm{unr}} q$ implies $(I, \Sigma) \models_{\mathrm{fin}} q$. We say a conjunction of constraints $\Sigma$
is finitely controllable if FQA and UQA coincide: for every finite instance $I$ and every Boolean CQ $q$,
$(I, \Sigma) \models_{\mathrm{unr}} q$ iff $(I, \Sigma) \models_{\mathrm{fin}} q$.

The combined complexity of the UQA and FQA problems, for a fixed class of constraints, is the
complexity of deciding it when all of $I$, $\Sigma$ (in the constraint class) and $q$ are given as input. The data
complexity is defined by assuming that $\Sigma$ and $q$ are fixed, and only $I$ is given as input.

Chase. We say that a superinstance $I'$ of an instance $I$ is universal for constraints $\Sigma$ if $I' \models \Sigma$ and
if for any CQ $q$, $I' \models q$ iff $(I, \Sigma) \models_{\mathrm{unr}} q$. We now recall the definition of the chase [1, 13], a standard
construction of (generally infinite) universal superinstances. We assume that we have fixed an infinite
set $\mathcal{N}$ of nulls which is disjoint from $\mathrm{dom}(I)$. We only define the chase for transitively closed UIDs,
which we call the UID chase.

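We say that a fact $F_a = R(\mathbf{a})$ of an instance $I$ is an active fact for a UID $\tau : R^p \subseteq S^q$ if, writing
$\tau : \forall \mathbf{x}\; R(\mathbf{x}) \rightarrow \exists \mathbf{y}\; S(\mathbf{z})$, there is a homomorphism from $R(\mathbf{x})$ to $F_a$, but no such homomorphism can
be extended to a homomorphism from $\{R(\mathbf{x}), S(\mathbf{z})\}$ to $I$. In this case we say that $a_p$ wants to occur
at position $S^q$ in $I$, written $a_p \in \mathrm{Wants}(I, S^q)$, and that we want to apply the UID $\tau$ to $a_p$, written
$a_p \in \mathrm{Wants}(I, \tau)$. Note that $\mathrm{Wants}(I, \tau) = \pi_{R^p}(I) \setminus \pi_{S^q}(I)$.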
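
The last identity gives a direct way to compute the elements to which a UID must be applied. A minimal Python sketch, assuming facts encoded as `(relation, values)` pairs and positions as 1-based `(relation, index)` pairs (the function names are hypothetical):

```python
def projection(instance, position):
    """Elements of the instance occurring at position (relation, index), 1-based."""
    relation, index = position
    return {values[index - 1] for rel, values in instance if rel == relation}


def wants(instance, uid):
    """Wants(I, tau) for a UID tau from position lhs to position rhs:
    the projection to lhs minus the projection to rhs."""
    lhs, rhs = uid
    return projection(instance, lhs) - projection(instance, rhs)


instance = {("R", ("a", "b")), ("S", ("b", "c"))}
```

In this hypothetical instance, the UID from $R^2$ to $S^1$ is satisfied, while a UID from $S^2$ to $T^1$ would have to be applied to `c`.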

The result of a chase step on the active fact $F_a = R(\mathbf{a})$ for $\tau$ in $I$ (we call this applying $\tau$ to $F_a$) is the
superinstance $I'$ of $I$ obtained by adding a new fact $F_n = S(\mathbf{b})$ defined as follows: we set $b_q := a_p$,
which we call the exported element (and $S^q$ the exported position of $F_n$), and use fresh nulls from $\mathcal{N}$
to instantiate the existentially quantified variables of $\tau$ and complete $F_n$; we say the corresponding
elements are introduced at $F_n$. This ensures that $F_a$ is no longer an active fact in $I'$ for $\tau$.

A chase round of a conjunction $\Sigma_{\mathrm{UID}}$ of UIDs on $I$ is the result of applying simultaneous chase steps
on all active facts for all UIDs of $\Sigma_{\mathrm{UID}}$, using distinct fresh elements. The UID chase $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$
of $I$ by $\Sigma_{\mathrm{UID}}$ is the (generally infinite) fixpoint of applying chase rounds. It is a universal superinstance
for $\Sigma_{\mathrm{UID}}$ [9].
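
Since the fixpoint is generally infinite, only a bounded number of chase rounds can be materialized. The following Python sketch illustrates the plain UID chase round described above (not the core chase), under assumptions made for the sketch: facts are `(relation, values)` pairs, UIDs are pairs of 1-based `(relation, index)` positions, `arities` maps relation names to arities, and nulls are strings named `_n0`, `_n1`, and so on, assumed not to occur in the input.

```python
from itertools import count


def chase_round(instance, uids, arities, fresh):
    """One chase round: a chase step per active fact and UID, all computed on `instance`."""
    new_facts = set()
    for (rel, p), (target, q) in uids:
        occupied = {values[q - 1] for r, values in instance if r == target}
        for r, values in instance:
            if r == rel and values[p - 1] not in occupied:  # active fact for this UID
                # Fresh nulls at every position, then the exported element at position q.
                new = [f"_n{next(fresh)}" for _ in range(arities[target])]
                new[q - 1] = values[p - 1]
                new_facts.add((target, tuple(new)))
    return instance | new_facts


def bounded_chase(instance, uids, arities, rounds):
    """Apply `rounds` chase rounds, sharing one supply of fresh nulls."""
    fresh = count()
    for _ in range(rounds):
        instance = chase_round(instance, uids, arities, fresh)
    return instance


# Hypothetical example: R^2 included in S^1 and S^2 included in R^1, starting from R(a, b).
uids = [(("R", 2), ("S", 1)), (("S", 2), ("R", 1))]
arities = {"R": 2, "S": 2}
I0 = {("R", ("a", "b"))}
```

On this example each round adds exactly one fact, so the chase never terminates, which is why the sketch takes an explicit bound.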

As we are chasing by transitively closed UIDs, if we perform the core chase [13] rather than the
UID chase defined above, we can ensure the following Unique Witness Property: for any element
$a \in \mathrm{dom}(\mathrm{Chase}(I, \Sigma_{\mathrm{UID}}))$ and position $R^p$ of $\sigma$, if two different facts of $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$ contain $a$ at
position $R^p$, then they are both facts of $I$. In our context, however, the core chase matches the UID
chase defined above, except at the first round. Thus, modulo the first round, by $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$ we refer
to the UID chase, which has the Unique Witness Property. See Appendix A for details.

Finite closure. Rosati [16, 18] showed that, while conjunctions of IDs are finitely controllable, even
conjunctions of UIDs and FDs may not be. However, Cosmadakis et al. [8] showed how to decide in
PTIME the finite implication problem for UIDs and FDs: given a conjunction $\Sigma$ of such dependencies,
decide whether a UID or FD is implied by $\Sigma$ over finite instances. The finite closure of $\Sigma$ is the set of
the UIDs and FDs thus implied by $\Sigma$ in the finite.

Rosati [17] later showed that the finite closure could be used to reduce FQA to UQA for some
constraints on relations of arity at most two. Following the same idea, we say that a conjunction of
constraints $\Sigma$ is finitely controllable up to finite closure if for every finite instance $I$ and Boolean CQ $q$,
$(I, \Sigma) \models_{\mathrm{fin}} q$ iff $(I, \Sigma') \models_{\mathrm{unr}} q$, where $\Sigma'$ is the finite closure of $\Sigma$. This implies that we can reduce FQA
to UQA, even if finite controllability does not hold.

III. Main Result and Overall Approach 

We study open-world query answering for FDs and UIDs. For unrestricted query answering (UQA), 
the following is already known, from bounds on UQA for UIDs: 

Proposition III.1. UQA for FDs and UIDs has PTIME data complexity and NP-complete combined
complexity.

However, for the finite case, even the decidability of FQA for FDs and UIDs is not known. Here is 
our main result, which is proved in the rest of this paper: 

Theorem III.2 (Main theorem). Conjunctions of FDs and UIDs are finitely controllable up to finite
closure.

From these two results, and an efficient computation of the closure, we deduce that the complexity 
of FQA matches that of UQA (see Appendix B.3): 

Corollary III.3. FQA for FDs and UIDs has PTIME data complexity and NP-complete combined
complexity.

III.1. Rephrasing with universal models

We prove the main theorem via the notions of k-sound and k-universal instances. 

Definition III.4. For $k \in \mathbb{N}$, we say that a superinstance $I$ of an instance $I_0$ is k-sound for constraints
$\Sigma$ (and for $I_0$) if for every constant-free CQ $q$ of size $\leq k$ such that $I \models q$, we have $(I_0, \Sigma) \models_{\mathrm{unr}} q$. We say
it is k-universal if the converse also holds: $I \models q$ whenever $(I_0, \Sigma) \models_{\mathrm{unr}} q$.

The assumption that $q$ is constant-free is without loss of generality: we can always assume that, for
each constant $c \in \mathrm{dom}(I_0)$, a fact $P_c(c)$ has been added to $I_0$ for a fresh unary relation $P_c$, and $c$ was
replaced in $q$ by an existentially quantified variable $x_c$ with the atom $P_c(x_c)$ added to $q$. So for simplicity
we assume from now on that queries are constant-free.

Theorem III.2 is implied by the following (see Appendix B.2): 

Theorem III.5 (Universal models). For every conjunction $\Sigma$ of FDs $\Sigma_{\mathrm{FD}}$ and UIDs $\Sigma_{\mathrm{UID}}$ closed under
finite implication, for every finite instance $I_0$ that satisfies $\Sigma_{\mathrm{FD}}$, for any $k \in \mathbb{N}$, there exists a finite
superinstance $I$ of $I_0$ that is k-sound for $\Sigma$ and satisfies $\Sigma$ (and hence is k-universal).

The fact that such an $I$ is k-universal is because any superinstance of $I_0$ that satisfies $\Sigma$ must satisfy
all CQs $q$ such that $(I_0, \Sigma) \models_{\mathrm{unr}} q$, by definition of $\models_{\mathrm{unr}}$.

We now fix the conjunction $\Sigma$ of FDs $\Sigma_{\mathrm{FD}}$ and UIDs $\Sigma_{\mathrm{UID}}$. We assume that $\Sigma$ is closed under finite
implication; in particular, $\Sigma_{\mathrm{FD}}$ and $\Sigma_{\mathrm{UID}}$ in isolation are closed under implication, which implies that
$\Sigma_{\mathrm{UID}}$ is transitively closed. We also fix the instance $I_0$ such that $I_0 \models \Sigma_{\mathrm{FD}}$, and the maximal query size
$k \in \mathbb{N}$.

Our goal in the rest of this paper is to construct the finite k-sound superinstance of $I_0$ that satisfies $\Sigma$,
thus proving the Universal Models Theorem and hence the Main Theorem.

III.2. Restricting to ACQs, UFDs, and reversible constraints 

We first prove the Universal Models Theorem for a restricted class of queries and dependencies, which
we now define. We will lift these restrictions later.

First, we define $\Sigma_{\mathrm{UFD}}$ to be the unary FDs of $\Sigma_{\mathrm{FD}}$, and write $\Sigma_{\mathrm{U}} := \Sigma_{\mathrm{UFD}} \wedge \Sigma_{\mathrm{UID}}$. Note that, as we
assumed that $\Sigma$ is closed under finite implication for UFDs and UIDs, the characterization of [8] implies
that $\Sigma_{\mathrm{U}}$ also is. We will first construct a k-sound superinstance that only satisfies $\Sigma_{\mathrm{U}}$; in Section VII we
will show how to adapt the process to also satisfy $\Sigma$.

Second, we will first construct a superinstance that is k-sound only for acyclic Boolean queries; in
Section VIII we will show how to make the resulting superinstance sufficiently acyclic to be sound for
cyclic queries as well.

Hence, in Sections IV, V and VI, we prove the following weakening of the Universal Models Theorem.
The restrictions will be lifted in Sections VII and VIII.

Theorem III.6 (Acyclic unary universal models). There exists a finite superinstance of $I_0$ that satisfies
$\Sigma_{\mathrm{U}}$ and is k-sound for $\Sigma_{\mathrm{U}}$ and ACQ (and hence k-universal for $\Sigma_{\mathrm{U}}$ and ACQ).

To prove the Acyclic Unary Universal Models Theorem, in Sections IV and V, we will assume the
following condition on the structure of the dependencies:

reversible: The following holds about $\Sigma_{\mathrm{U}}$:

• all UIDs in $\Sigma_{\mathrm{UID}}$ are reversible (remember this means that the reverse of any $\tau \in \Sigma_{\mathrm{UID}}$
is also in $\Sigma_{\mathrm{UID}}$);

• for any positions $R^p$ and $R^q$ occurring in UIDs of $\Sigma_{\mathrm{UID}}$, if $R^p \rightarrow R^q$ is in $\Sigma_{\mathrm{UFD}}$ then so is
$R^q \rightarrow R^p$.

Intuitively, assumption reversible is connected to the finite closure characterization of [8], which adds
to $\Sigma_{\mathrm{U}}$ the reverses of any UIDs and UFDs that form a certain cyclic pattern.

Working under assumption reversible, Section IV proves an even weaker version of the Acyclic
Unary Universal Models Theorem, which replaces k-soundness by weak-soundness; Section V proves
the actual theorem. Assumption reversible is lifted in Section VI to conclude the proof.


IV. Weak-Soundness and Reversible UIDs 

The goal of this section is to prove the Acyclic Unary Universal Models Theorem (Theorem III.6) under
assumption reversible, replacing k-soundness by weak-soundness.

Definition IV.1. A superinstance $I'$ of an instance $I$ is weakly-sound if the following holds:

• for any $a \in \mathrm{dom}(I)$ and $R^p \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I')$, then either $a \in \pi_{R^p}(I)$ or $a \in \mathrm{Wants}(I, R^p)$;

• for any $a \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ and $R^p, S^q \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I')$ and $a \in \pi_{S^q}(I')$ then $R^p = S^q$ or
$R^p \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$.

Intuitively, a superinstance is weakly-sound if existing elements were only added to positions where 
they wanted to appear, and new elements only occur at positions which are connected in $\Sigma_{\mathrm{UID}}$. This
section shows the following: 

Proposition IV.2 (Acyclic unary weakly-sound models). Under assumption reversible, there exists a
finite superinstance of $I_0$ that satisfies $\Sigma_{\mathrm{U}}$ and is weakly-sound.

The proposition itself will not be reused in the sequel, but the proof introduces some useful concepts 
to prove the actual Acyclic Unary Universal Models Theorem in Section V. 

IV.1. Binary signatures and balanced instances

For simplicity, we first focus on a simplified case with a binary signature, making the following
assumption that will be lifted later in this section:

binary: all relations have arity 2 and $\Sigma_{\mathrm{UFD}}$ contains the UFDs $R^1 \rightarrow R^2$ and $R^2 \rightarrow R^1$ for any relation $R$.

Our approach to construct a weakly-sound superinstance $I'$ of $I_0$ that satisfies $\Sigma_{\mathrm{U}}$ is then to perform
a completion process that adds new (binary) facts to connect together elements. As all possible UFDs
hold, $I'$ can only contain a new fact $R(a_1, a_2)$ if, for $i \in \{1, 2\}$, $a_i \notin \pi_{R^i}(I_0)$, so that if $a_i \in \mathrm{dom}(I_0)$ then
$a_i \in \mathrm{Wants}(I_0, R^i)$ by weak soundness.

One easy situation is when $I_0$ is balanced: for every relation $R$, we can construct a bijection between
the elements that want to be in $R^1$ and those that want to be in $R^2$:

Definition IV.3. An instance $I$ is balanced if for every two positions $R^p$ and $R^q$ such that $R^p \rightarrow R^q$ and
$R^q \rightarrow R^p$ are in $\Sigma_{\mathrm{UFD}}$, we have $|\mathrm{Wants}(I, R^p)| = |\mathrm{Wants}(I, R^q)|$.

If $I_0$ is balanced, we can show the Acyclic Unary Weakly-Sound Models Proposition under
assumption binary, simply by pairing together elements, without adding any new ones:

Proposition IV.4. Assuming binary and reversible, any balanced finite instance $I$ satisfying $\Sigma_{\mathrm{UFD}}$ has
a finite weakly-sound superinstance $I'$ that satisfies $\Sigma_{\mathrm{U}}$, with $\mathrm{dom}(I') = \mathrm{dom}(I)$.

However, our instance may not be balanced. The idea is then to balance it by adding “helper” 
elements and assigning them to positions, as the following example shows: 

Example IV.5. Consider three binary relations $R$, $S$, $T$, with the UIDs $R^2 \subseteq S^1$, $S^2 \subseteq T^1$, $T^2 \subseteq R^1$
and their reverses, and the FDs prescribed by assumption binary. Consider $I_0 := \{R(a,b)\}$. We have
$a \in \mathrm{Wants}(I_0, T^2)$ and $b \in \mathrm{Wants}(I_0, S^1)$; however $\mathrm{Wants}(I_0, S^2) = \mathrm{Wants}(I_0, T^1) = \emptyset$, so $I_0$ is not
balanced.

Still, we can construct the weakly-sound superinstance $I := \{R(a,b), S(b,c), T(c,a)\}$ that satisfies
the constraints. Intuitively, we have added a “helper” element $c$ and “assigned” it to the positions $S^2$
and $T^1$, which are connected by the UIDs.
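
The claims of this example can be checked mechanically. The Python sketch below is an illustration only: the encodings (facts as `(relation, values)` pairs, positions as 1-based `(relation, index)` pairs) and the helper names are assumptions of the sketch, and the balancedness test hard-codes assumption binary.

```python
def projection(instance, position):
    relation, index = position
    return {values[index - 1] for rel, values in instance if rel == relation}


def wants(instance, uids, position):
    """Elements occurring at the left-hand side of a UID into `position` but not at `position`."""
    wanting = set()
    for lhs, rhs in uids:
        if rhs == position:
            wanting |= projection(instance, lhs)
    return wanting - projection(instance, position)


def is_balanced(instance, uids, relations):
    # Under assumption binary, the two positions of each relation determine each other.
    return all(len(wants(instance, uids, (r, 1))) == len(wants(instance, uids, (r, 2)))
               for r in relations)


def satisfies_binary_ufds(instance, relations):
    # Both UFDs of each binary relation hold iff no value repeats at a position.
    for r in relations:
        facts = [values for rel, values in instance if rel == r]
        for i in (0, 1):
            if len({v[i] for v in facts}) != len(facts):
                return False
    return True


base = [(("R", 2), ("S", 1)), (("S", 2), ("T", 1)), (("T", 2), ("R", 1))]
uids = set(base) | {(rhs, lhs) for lhs, rhs in base}  # the UIDs and their reverses
relations = ("R", "S", "T")
I0 = {("R", ("a", "b"))}
I = {("R", ("a", "b")), ("S", ("b", "c")), ("T", ("c", "a"))}
```

With these definitions, `is_balanced(I0, uids, relations)` is false, while in `I` no element wants to occur anywhere and the UFDs hold.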

We now formalize this idea of constructing weakly-sound superinstances where the domain is
augmented with helper elements. We first need to understand at which positions the helpers can appear to
avoid violating weak-soundness:



Definition IV.6. For any two positions $R^p$ and $S^q$, we write $R^p \sim_{\mathrm{ID}} S^q$ when $R^p = S^q$ or when $R^p \subseteq S^q$,
and hence $S^q \subseteq R^p$ by assumption reversible, are in $\Sigma_{\mathrm{UID}}$.

As $\Sigma_{\mathrm{UID}}$ is transitively closed, $\sim_{\mathrm{ID}}$ is an equivalence relation. Our idea to construct weakly-sound
superinstances is thus to first decide on the helpers that we want to add, and the $\sim_{\mathrm{ID}}$-class to which
we want to assign them, following the definition of weak-soundness. We represent this choice as a
partially-specified superinstance, or pssinstance:

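Definition IV.7. A pssinstance of an instance $I$ is a triple $P = (I, \mathcal{H}, \lambda)$ where $\mathcal{H}$ is a finite set of helpers
and $\lambda$ maps each $h \in \mathcal{H}$ to an $\sim_{\mathrm{ID}}$-class $\lambda(h)$.

We define $\mathrm{Wants}(P, R^p) := \mathrm{Wants}(I, R^p) \sqcup \{h \in \mathcal{H} \mid R^p \in \lambda(h)\}$. This allows us to talk of $P$ being
balanced following Definition IV.3.

A superinstance $I'$ of $I$ is a realization of $P$ if $\mathrm{dom}(I') = \mathrm{dom}(I) \sqcup \mathcal{H}$, and, for any fact $R(\mathbf{a})$ of $I' \setminus I$
and $R^p \in \mathrm{Pos}(R)$, we have $a_p \in \mathrm{Wants}(P, R^p)$.

Example IV.8. In Example IV.5, a pssinstance of $I_0$ is $P := (I_0, \{c\}, \lambda)$ where $\lambda(c) := \{S^2, T^1\}$, and $I$
is a realization of $P$.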

It is always possible to balance an instance by adding helpers: 

Lemma IV.9 (Balancing). For any finite instance $I$, if $I$ satisfies $\Sigma_{\mathrm{UFD}}$ then it has a balanced
pssinstance.

From there, we can construct realizations like we constructed superinstances in Proposition IV.4.

Lemma IV.10 (Binary realizations). For any balanced pssinstance $P$ of an instance $I$ that satisfies
$\Sigma_{\mathrm{UFD}}$, we can construct a realization of $P$ that satisfies $\Sigma_{\mathrm{U}}$.

We then observe that realizations are weakly-sound superinstances of $I_0$.

Lemma IV.11 (Binary realizations are completions). If $I'$ is a realization of a pssinstance of $I$ then it
is a weakly-sound superinstance of $I$.

We have thus proved the Acyclic Unary Weakly-Sound Models Proposition under assumptions binary 
and reversible, using the completion process formed by combining the three above lemmas. 

IV.2. Arbitrary arity and piecewise realizations 

We now lift assumption binary (but retain assumption reversible). We show how to generalize the 
previous constructions to the arbitrary arity case. Contrary to the binary situation, we will see later that 
the resulting completion process needs to assume that a certain saturation process has been applied 
to $I_0$ beforehand.

The definition of balanced instances (Definition IV.3) generalizes to arbitrary arity, and we can show 
that the Balancing Lemma (Lemma IV.9) still holds. We keep the definition of pssinstance
(Definition IV.7) but need to change the notion of realization. We replace it by piecewise realizations, which
are defined on subsets of positions that are connected in $\Sigma_{\mathrm{UFD}}$.

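Definition IV.12. For any two positions $R^p$ and $R^q$, we write $R^p \leftrightarrow_{\mathrm{FUN}} R^q$ whenever $R^p \rightarrow R^q$ and
$R^q \rightarrow R^p$ are in $\Sigma_{\mathrm{UFD}}$.

By transitivity of $\Sigma_{\mathrm{UFD}}$, $\leftrightarrow_{\mathrm{FUN}}$ is clearly an equivalence relation. We number the $\leftrightarrow_{\mathrm{FUN}}$-classes of
$\mathrm{Pos}(\sigma)$ as $\Pi_1, \ldots, \Pi_n$ and define piecewise instances by their projections to the $\Pi_i$: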

Definition IV.13. A piecewise instance is an $n$-tuple $PI = (K_1, \ldots, K_n)$, where each $K_i$ is a set of
$|\Pi_i|$-tuples, indexed by $\Pi_i$ for convenience. The domain of $PI$ is $\mathrm{dom}(PI) := \bigcup_i \mathrm{dom}(K_i)$. For $1 \leq i \leq n$ and
$R^p \in \Pi_i$, we write $\pi_{R^p}(PI) := \pi_{R^p}(K_i)$.

We use this to define piecewise realizations of pssinstances:

Definition IV.14. A piecewise instance $PI = (K_1, \ldots, K_n)$ is a piecewise realization of the pssinstance
$P = (I, \mathcal{H}, \lambda)$ if:

• $\pi_{\Pi_i}(I) \subseteq K_i$ for all $1 \leq i \leq n$,

• $\mathrm{dom}(PI) = \mathrm{dom}(I) \sqcup \mathcal{H}$,

• for all $1 \leq i \leq n$, for all $R^p \in \Pi_i$, for every tuple $\mathbf{a} \in K_i \setminus \pi_{\Pi_i}(I)$, we have $a_p \in \mathrm{Wants}(P, R^p)$.

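In order to generalize the Binary Realizations Lemma (Lemma IV.10), we need to talk of a piecewise
instance $PI$ “satisfying” $\Sigma_{\mathrm{U}}$. For $\Sigma_{\mathrm{UFD}}$, we require that $PI$ respects the UFDs within each $\leftrightarrow_{\mathrm{FUN}}$-class.
For $\Sigma_{\mathrm{UID}}$, we define it directly from the projections of $PI$.

Definition IV.15. A piecewise instance $PI$ is $\Sigma_{\mathrm{UFD}}$-compliant if for all $1 \leq i \leq n$, there are no two
tuples $\mathbf{a} \neq \mathbf{b}$ in $K_i$ such that $a_p = b_p$ for some $R^p \in \Pi_i$.

$PI$ is $\Sigma_{\mathrm{UID}}$-compliant if $\mathrm{Wants}(PI, \tau) := \pi_{R^p}(PI) \setminus \pi_{S^q}(PI)$ is empty for all $\tau \in \Sigma_{\mathrm{UID}}$.
$PI$ is $\Sigma_{\mathrm{U}}$-compliant if it is $\Sigma_{\mathrm{UFD}}$- and $\Sigma_{\mathrm{UID}}$-compliant.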

We can then generalize the Binary Realizations Lemma:

Lemma IV.16 (Realizations). For any balanced pssinstance P of an instance I that satisfies Σ_UFD, we
can construct a Σ_U-compliant piecewise realization of P.

Example IV.17. Consider a 4-ary relation R and the UIDs τ : R^1 ⊆ R^2, τ' : R^3 ⊆ R^4 and their reverses,
and the UFDs φ : R^1 → R^2, φ' : R^3 → R^4 and their reverses. We have Π_1 = {R^1, R^2} and
Π_2 = {R^3, R^4}. Consider I_0 := {R(a, b, c, d)}, which is balanced, and the balanced pssinstance
P := (I_0, ∅, λ), where λ is the empty function. A Σ_U-compliant piecewise realization of P is
PI := ({(a, b), (b, a)}, {(c, d), (d, c)}).
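
The compliance conditions of Definition IV.15 can be checked mechanically. The following is a minimal sketch, assuming a piecewise instance is encoded as a list of sets of tuples (one set per ↔_FUN-class) and a UID R^p ⊆ R^q as a pair of position names; the function names and the encoding are illustrative and not part of the construction.

```python
from itertools import combinations


def ufd_compliant(pieces):
    """No two distinct tuples of the same K_i may agree on any position."""
    for k in pieces:
        for s, t in combinations(k, 2):
            if any(x == y for x, y in zip(s, t)):
                return False
    return True


def project(pieces, classes, pos):
    """Projection of PI to position pos, read from the class containing pos."""
    for cls, k in zip(classes, pieces):
        if pos in cls:
            j = cls.index(pos)
            return {t[j] for t in k}
    raise KeyError(pos)


def uid_compliant(pieces, classes, uids):
    """Each UID (p, q), read as p ⊆ q, must have an empty Wants set."""
    return all(
        project(pieces, classes, p) <= project(pieces, classes, q)
        for p, q in uids
    )


# Encoding of Example IV.17 (position names are plain strings here).
classes = [("R1", "R2"), ("R3", "R4")]
pieces = [{("a", "b"), ("b", "a")}, {("c", "d"), ("d", "c")}]
uids = [("R1", "R2"), ("R2", "R1"), ("R3", "R4"), ("R4", "R3")]
```

With this encoding, both checks succeed on the realization of Example IV.17, whereas the projections of I_0 alone, ({(a, b)}, {(c, d)}), fail the UID check.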

We now transform the Σ_U-compliant piecewise realization PI into a weakly-sound superinstance,
generalizing the "Binary Realizations Are Completions" Lemma (Lemma IV.11), and completing the
description of our completion process. The idea is to expand each tuple t of each K_i to an entire fact F_t
of the corresponding relation.

However, to fill the other positions of F_t, we will need to reuse existing elements of I_0. For this, we
want I_0 to contain some R-fact for every relation R that occurs in Chase(I_0, Σ_UID).

Definition IV.18. A relation R is achieved (by I and Σ_UID) if there is some R-fact in Chase(I, Σ_UID).
A superinstance I' of an instance I is relation-saturated (for Σ_UID) if every achieved relation (by I
and Σ_UID) occurs in I'.

Example IV.19. Consider two binary relations R and T and a unary relation S, the UIDs τ : S^1 ⊆ R^1,
τ' : R^2 ⊆ T^1 and their reverses, no UFDs, and the non-relation-saturated instance I_0 := {S(a)}, which
is trivially balanced.

P := (I_0, ∅, λ), with λ the empty function, is a pssinstance of I_0, and PI := ({(a)}, ∅, {(a)}, ∅, ∅),
where Π_1 and Π_3 are the ↔_FUN-classes of R^1 and S^1, is a Σ_U-compliant piecewise realization of P.
However, we cannot easily complete PI to a superinstance of I_0 satisfying τ and τ', because, to create
the fact R(a, •), we need to create an element to fill position R^2, and this would introduce a violation
of τ'. Intuitively, this is because I_0 is not relation-saturated.

Consider instead the instance I_1 := I_0 ∪ {S(c), R(c, d), T(d)}. We can complete I_1 to satisfy τ and τ'
by adding the fact R(a, d), reusing the element d to fill position R^2.

Clearly, initial chasing on I_0 ensures relation-saturation:

Lemma IV.20 (Relation-saturated solutions). The result of performing sufficiently many chase rounds 
on any instance I is relation-saturated. 
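
Which relations are achieved can be computed without materializing the chase. The sketch below assumes the observation that every R-fact has an element at each position of R, so that a UID R^p ⊆ S^q forces an S-fact in the chase as soon as R is achieved; the encoding of UIDs as pairs of (relation, position) and the function names are illustrative.

```python
def achieved_relations(instance_relations, uids):
    """Least set of relations containing those of I and closed under the UIDs.

    instance_relations: names of relations with at least one fact in I.
    uids: pairs ((R, p), (S, q)) standing for the UID R^p ⊆ S^q.
    """
    achieved = set(instance_relations)
    changed = True
    while changed:
        changed = False
        for (r, _p), (s, _q) in uids:
            if r in achieved and s not in achieved:
                achieved.add(s)
                changed = True
    return achieved


def is_relation_saturated(instance_relations, uids):
    """An instance is relation-saturated if it already uses every achieved relation."""
    return achieved_relations(instance_relations, uids) <= set(instance_relations)


# UIDs of Example IV.19 together with their reverses.
uids = [(("S", 1), ("R", 1)), (("R", 1), ("S", 1)),
        (("R", 2), ("T", 1)), (("T", 1), ("R", 2))]
```

On Example IV.19, the instance I_0 only uses S and is not relation-saturated, while I_1 uses S, R and T and is.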

Relation-saturation ensures that we can reuse existing elements when completing PI. This allows us 
to perform the last step of the completion process: 

Lemma IV.21 (Using realizations to get completions). For any finite relation-saturated instance I that
satisfies Σ_UFD, from a Σ_U-compliant piecewise realization PI of a pssinstance of I, we can construct a
finite weakly-sound superinstance of I that satisfies Σ_U.

We can now prove the Acyclic Unary Weakly-Sound Models Proposition. Consider our initial finite
instance I_0, which satisfies Σ_UFD, and chase it to a finite relation-saturated superinstance I'_0 using the
Relation-Saturated Solutions Lemma. By the Unique Witness Property, I'_0 still satisfies Σ_UFD, and it is
clearly a weakly-sound superinstance of I_0.

Now, perform the completion process: construct a balanced pssinstance P of I'_0 using the Balancing
Lemma (Lemma IV.9), and a finite Σ_U-compliant piecewise realization PI of P by the Realizations
Lemma (Lemma IV.16). Then, use the realization PI with Lemma IV.21 to construct the finite weakly-sound
superinstance I of I'_0 that satisfies Σ_U. It is clearly also a weakly-sound superinstance of I_0, so the
result is proven.


V. k-Soundness and Reversible UIDs

We now move from weak-soundness to k-soundness, to prove the Acyclic Unary Universal Models
Theorem (Theorem III.6), still making assumption reversible.

We first introduce the notion of aligned superinstances that we use to maintain k-soundness, and
give the saturation process that generalizes relation-saturation. We then define a notion of thrifty chase
steps, and a completion process that uses these chase steps to repair UID violations in the instance.

V.1. Aligned superinstances and fact-saturation

We ensure k-soundness by maintaining a k-bounded simulation from our superinstance of I_0 to the
chase Chase(I_0, Σ_UID). Indeed, Chase(I_0, Σ_UID) is a universal model for Σ_UID, and it satisfies Σ_FD (by
the Unique Witness Property, and because I_0 does). Hence, it is in particular k-sound for Σ. Now, as
acyclic queries of size ≤ k are preserved through k-bounded simulations, superinstances of I_0 with a
k-bounded simulation to Chase(I_0, Σ_UID) are indeed k-sound for ACQ.

Definition V.1. For I, I' two instances, a ∈ dom(I), b ∈ dom(I'), and n ∈ ℕ, we write (I, a) ≤_n (I', b) if
for any fact R(a) of I with a_p = a for some R^p ∈ Pos(R), there exists a fact R(b) of I' such that b_p = b,
and (I, a_q) ≤_{n-1} (I', b_q) for all R^q ∈ Pos(R). The base case (I, a) ≤_0 (I', b) always holds.

An n-bounded simulation from I to I' is a mapping sim such that for all a ∈ dom(I), (I, a) ≤_n
(I', sim(a)).

We write a ≃_n b for a, b ∈ dom(I) if both (I, a) ≤_n (I, b) and (I, b) ≤_n (I, a); this is an equivalence
relation on dom(I).
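
The recursive definition of ≤_n translates directly into code. The following is a minimal sketch, assuming that an instance is encoded as a set of pairs (relation name, tuple of elements); the helper name `make_leq` is illustrative, and the recursion simply unfolds Definition V.1 with memoization.

```python
from functools import lru_cache


def make_leq(inst1, inst2):
    """Return leq(a, b, n), deciding (I, a) <=_n (I', b) for I = inst1, I' = inst2."""
    facts1, facts2 = tuple(inst1), tuple(inst2)

    @lru_cache(maxsize=None)
    def leq(a, b, n):
        if n == 0:
            return True  # base case: always holds
        for rel, args in facts1:
            for p, x in enumerate(args):
                if x != a:
                    continue
                # Look for a fact of I' on the same relation with b at position p
                # whose elements simulate those of the current fact at depth n - 1.
                if not any(
                    rel2 == rel
                    and args2[p] == b
                    and all(leq(u, v, n - 1) for u, v in zip(args, args2))
                    for rel2, args2 in facts2
                ):
                    return False
        return True

    return leq


# A single edge R(a, b) versus a 2-cycle R(c, d), R(d, c).
edge = {("R", ("a", "b"))}
cycle = {("R", ("c", "d")), ("R", ("d", "c"))}
```

In this toy example the edge is simulated by the cycle at every depth, but not conversely: c occurs at the second position of an R-fact of the cycle, whereas a never occurs at the second position in the edge.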

Lemma V.2. For any instance I and ACQ q of size ≤ n such that I ⊨ q, if there is an n-bounded
simulation from I to I', then I' ⊨ q.

We accordingly give a name to superinstances of I_0 that have a k-bounded simulation to the chase.
For convenience, we also require them to be finite and satisfy Σ_UFD. For technical reasons we require
that the simulation is the identity on I_0, that it does not map other elements to I_0, and that elements
occur in the superinstance at least at the position where their sim-image was introduced in the chase:

Definition V.3. An aligned superinstance J = (I, sim) of I_0 is a finite superinstance I of I_0 that satisfies
Σ_UFD, and a k-bounded simulation sim from I to Chase(I_0, Σ_UID) such that sim|_{I_0} is the identity and
sim|_{(I \ I_0)} maps to Chase(I_0, Σ_UID) \ I_0.

Further, for any a ∈ dom(I) \ dom(I_0), letting R^p be the position where sim(a) was introduced in
Chase(I_0, Σ_UID), we require that a ∈ π_{R^p}(I).

Before we perform the completion process that allows us to satisfy Σ_UID, we need to perform a
saturation process, like relation-saturation in the previous section. Instead of achieving all relations,
we want the aligned superinstance to achieve all fact classes:

Definition V.4. A fact class is a pair (R^p, C) of a position R^p ∈ Pos(σ) and a |R|-tuple of ≃_k-classes
of elements of Chase(I_0, Σ_UID). The dependency on k is omitted for brevity.

The fact class of a fact F = R(a) of Chase(I_0, Σ_UID) \ I_0 is (R^p, C), where a_p is the exported element
of F and C_i is the ≃_k-class of a_i in Chase(I_0, Σ_UID) for all R^i ∈ Pos(R).

A fact class (R^p, C) is achieved if it is the fact class of some fact of Chase(I_0, Σ_UID) \ I_0. We write
AFactCl for the set of all achieved fact classes (for brevity, the dependence on I_0, Σ_UID, and k is omitted
from notation).

An aligned superinstance J = (I, sim) is fact-saturated if, for any achieved fact class D = (R^p, C) in
AFactCl, there is a fact F_D = R(a) of I \ I_0 such that sim(a_i) ∈ C_i for all R^i ∈ Pos(R). We say that F_D
achieves D in J.

Lemma V.5. For any initial instance I_0, set Σ_UID of UIDs, and k ∈ ℕ, AFactCl is finite.

We now define our saturation process: chase until all fact classes are achieved, which is possible in
finitely many rounds thanks to the above lemma. The result is easily seen to be a fact-saturated aligned
superinstance:

Lemma V.6 (Fact-saturated solutions). The result I of performing sufficiently many chase rounds on
I_0 is such that J_0 = (I, id) is a fact-saturated aligned superinstance of I_0.

We thus obtain a fact-saturated aligned superinstance J_0 of I_0, which we now want to complete to
one that satisfies Σ_UID.

V.2. Fact-thrifty completion 

Our general method to repair UID violations in J_0 is to apply a form of chase step on aligned
superinstances, which may reuse elements: thrifty chase steps. To define them, we first distinguish
dangerous and non-dangerous positions, which determine how we may reuse elements when chasing.

Definition V.7. We say a position S^r ∈ Pos(σ) is dangerous for a position S^q ≠ S^r if S^r → S^q is
in Σ_UFD, and write S^r ∈ Dng(S^q). Otherwise, S^r is non-dangerous, written S^r ∈ NDng(S^q). Note that
{S^q} ⊔ Dng(S^q) ⊔ NDng(S^q) = Pos(S).
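
As a small illustration of this definition, the sketch below splits the positions of a relation into Dng(S^q) and NDng(S^q). It assumes that positions are encoded as integers and the UFDs of the relation as pairs (r, q) standing for S^r → S^q; both choices are illustrative.

```python
def dangerous(positions, ufds, q):
    """Return (Dng(S^q), NDng(S^q)) for the positions of a single relation S.

    positions: iterable of position indices of S.
    ufds: set of pairs (r, q) standing for the UFD S^r → S^q.
    """
    dng = {r for r in positions if r != q and (r, q) in ufds}
    ndng = set(positions) - dng - {q}
    return dng, ndng
```

For a ternary relation with the single UFD S^2 → S^1, position S^2 is dangerous for S^1 and position S^3 is not; without any UFD, both S^2 and S^3 are non-dangerous for S^1.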

Definition V.8 (Thrifty chase steps). Let J = (I, sim) be an aligned superinstance of I_0, let τ : R^p ⊆ S^q
be a UID of Σ_UID, and let F_a = R(a) be an active fact for τ in I.

Because sim is a 1-bounded simulation, sim(a_p) ∈ π_{R^p}(Chase(I_0, Σ_UID)). Because the chase
satisfies τ, there is a fact F_w = S(b') in Chase(I_0, Σ_UID) with b'_q = sim(a_p); we call F_w the chase witness.

Applying a thrifty chase step on F_a for τ yields an aligned superinstance J' = (I', sim'). We define
I' as I plus a new fact F_n = S(b), where b_q = a_p and the b_r for S^r ≠ S^q may be elements of dom(I) or
fresh elements. We require that:

• for S^r ∈ NDng(S^q), b_r ∈ π_{S^r}(I) (so they are not fresh)

• for S^r ∈ Dng(S^q), b_r ∉ π_{S^r}(I) (so they may be fresh)

• for S^r ≠ S^q, if b_r is not fresh then sim(b_r) ≃_k b'_r.

We define sim' by extending sim to dom(I'): we set sim'(b_r) := b'_r whenever b_r is fresh.

A fact-thrifty chase step is a thrifty chase step where we choose one fact F_r = S(c) of I \ I_0 that
achieves the fact class of F_w (that is, sim(c_i) ≃_k b'_i for all i), and use F_r to define b_r := c_r for all
S^r ∈ NDng(S^q).

The chase step is fresh if b_r is fresh for all S^r ∈ Dng(S^q).

Thrifty chase steps may in general violate Σ_UFD, but fact-thrifty chase steps never do. For this reason,
we will only use fact-thrifty chase steps in this section. The point of working with fact-saturated aligned
superinstances is that we can ensure that a suitable F_r always exists. We thus claim:

Lemma V.9 (Fact-thrifty chase steps). For any fact-saturated aligned superinstance J, the result J' of
a fact-thrifty chase step on J is indeed a well-defined aligned superinstance where the former active
fact F_a is no longer active.

We now claim that we can expand fact-saturated superinstances to satisfy Σ_UID, using fact-thrifty
chase steps:

Proposition V.10 (Fact-thrifty completion). Under assumption reversible, for any fact-saturated aligned
superinstance J of I_0, we can expand J by fact-thrifty chase steps to a fact-saturated aligned
superinstance J' of I_0 that satisfies Σ_UID.

This proposition allows us to prove the Acyclic Unary Universal Models Theorem (Theorem III.6)
under assumption reversible. Indeed, consider the fact-saturated aligned superinstance J_0 produced by
the Fact-Saturated Solutions Lemma (Lemma V.6). Applying the Fact-Thrifty Completion Proposition
to J_0 yields a fact-saturated aligned superinstance J', which is a finite k-sound superinstance of I_0 that
satisfies Σ_UFD and Σ_UID.

The rest of this section sketches the proof of the Proposition (see Appendix D.5 for the full proof).
The idea is to construct, as in Section IV, a balanced pssinstance P of the input aligned superinstance J,
and a Σ_U-compliant piecewise realization PI of P. Now, instead of completing the facts of PI to add
them directly to J, we add them one by one, using fact-thrifty chase steps, to ensure that alignedness is
preserved.

The only problematic point is that PI could connect together elements that have dissimilar
sim-images, violating alignedness. However, we show that, up to chasing for k + 1 rounds on the initial J
with fresh fact-thrifty chase steps before constructing P, we can ensure what we call k-reversibility: all
elements that want to be at some position R^p in J have a sim-image whose ≃_k-class only depends on R^p.
Once we have ensured this, we can essentially stop worrying about sim-images, because respecting
weak-soundness, as PI does, is sufficient.

The reason why k + 1 chasing rounds suffice to ensure this is a general structural observation on
the UID chase: when the last k UIDs applied to an element a of Chase(I_0, Σ_UID) are reversible (as is the
case here, by assumption reversible), the ≃_k-class of a only depends on the ~_ID-class of the position
where it was introduced, and not on its exact history. Formally:

Theorem V.11 (Chase locality theorem). For any instance I_0, transitively closed set of UIDs Σ_UID, and
n ∈ ℕ, for any two elements a and b respectively introduced at positions R^p and S^q in Chase(I_0, Σ_UID)
such that R^p ~_ID S^q, if the last n UIDs applied to create a and b are reversible, then a ≃_n b.


VI. Arbitrary UIDs: Lifting Assumption reversible 

This section concludes the proof of the Acyclic Unary Universal Models Theorem (Theorem III.6) by
removing assumption reversible. We do so by splitting Σ_UID into subsets that can be satisfied sequentially:

Definition VI.1. For any τ, τ' ∈ Σ_UID, we write τ ↣ τ' when we can write τ = R^p ⊆ S^q and τ' = S^r ⊆ T^u
with S^q ≠ S^r, and the UFD S^r → S^q is in Σ_UFD. An ordered partition (P_1, ..., P_n) of Σ_UID is a partition
of Σ_UID (i.e., Σ_UID = ⊔_i P_i) such that for any τ ∈ P_i, τ' ∈ P_j, if τ ↣ τ' then i ≤ j.
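
To make the ordering condition concrete, here is a minimal sketch that tests the relation between two UIDs and checks whether a candidate partition is ordered. The encoding is an assumption made for illustration: a UID R^p ⊆ S^q is the pair ((R, p), (S, q)) and a UFD S^r → S^q is the pair ((S, r), (S, q)); the relations A, B, C in the sample data are hypothetical and do not come from the paper.

```python
def precedes(t1, t2, ufds):
    """True when t1 = R^p ⊆ S^q and t2 = S^r ⊆ T^u with S^q != S^r and S^r → S^q a UFD."""
    (_, (s, q)), ((s2, r), _) = t1, t2
    return s == s2 and q != r and ((s, r), (s, q)) in ufds


def is_ordered_partition(classes, ufds):
    """classes: list of lists of UIDs. Checks that the relation never points backwards."""
    index = {t: i for i, cls in enumerate(classes) for t in cls}
    return all(
        index[t1] <= index[t2]
        for t1 in index
        for t2 in index
        if precedes(t1, t2, ufds)
    )


# Hypothetical constraints: tau1 = A^1 ⊆ B^1, tau2 = B^2 ⊆ C^1, and the UFD B^2 → B^1.
tau1 = (("A", 1), ("B", 1))
tau2 = (("B", 2), ("C", 1))
ufds = {(("B", 2), ("B", 1))}
```

Here chasing by tau1 creates B-facts whose dangerous position B^2 may hold fresh elements, which can violate tau2; hence tau1 must come no later than tau2 in an ordered partition.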

The notion of ordered partition is useful because thrifty chase steps can only cause new UID
violations at the dangerous positions of the new fact. This implies the following:

Lemma VI.2. Let J be an aligned superinstance of I_0 and J' be the result of applying a thrifty chase
step on J for a UID τ of Σ_UID. Assume that a UID τ' of Σ_UID was satisfied by J but is not satisfied by J'.
Then τ ↣ τ'.

Hence, given an ordered partition of Σ_UID, once we have satisfied the UIDs of the first i classes
P_1, ..., P_i, this property is preserved while we do thrifty chasing with P_j, j > i. So if we can satisfy
each P_i individually with thrifty chase steps, then we can satisfy Σ_UID by satisfying P_1, ..., P_n in turn.

Of course, the point of partitioning Σ_UID is to be able to control the structure of the UIDs in each
class:

Definition VI.3. We call P ⊆ Σ_UID reversible if it is transitively closed (as Σ_UID is) and satisfies
assumption reversible.

We say P ⊆ Σ_UID is trivial if we have P = {τ} for some τ ∈ Σ_UID such that τ ↣ τ does not hold. An
ordered partition is manageable if all of its classes are either reversible or trivial.

If P ⊆ Σ_UID is reversible, then the previous section describes how to complete with thrifty chase steps
any fact-saturated aligned superinstance of I_0 to one that satisfies P. If P is trivial, it follows directly
from Lemma VI.2 that we can satisfy it:

Corollary VI.4. For any trivial class {τ}, performing one chase round on an aligned fact-saturated
superinstance J of I_0 by fresh fact-thrifty chase steps for τ yields an aligned superinstance J' of I_0 that
satisfies τ.

We now claim that we can construct a manageable partition of Σ_UID. We build it as a topological sort
of the strongly connected components (SCCs) of the directed graph on Σ_UID defined by ↣, with the
technical complication that SCCs must be closed under UID reversal. The construction relies on the
fact that Σ_UID is closed under finite implication, as characterized by Cosmadakis et al. [8].

Lemma VI.5. Any conjunction Σ_UID of UIDs closed under finite implication has a manageable
partition.

Example VL6. Consider the U\Ds Xr : R^ Q R^, Xs : C S^, X : R^ F S^, and the UfDs <pR : R^ ^ R^, 
^5 : — 7 - S/ : R^ -^R^, and —)■ S^. The UID 5 xf^ and and UFD 5 andR^ — 

S^, are finitely implied. A manageable partition is ({τ_R, τ_R^{-1}}, {τ}, {τ_S, τ_S^{-1}}), where the first and
third classes are reversible and the second is trivial.
We can now conclude the proof of the Acyclic Unary Universal Models Theorem (Theorem III.6).
We first note that the Fact-Saturated Solutions Lemma (Lemma V.6) does not use assumption reversible,
so we apply it (with Σ_UID) to obtain from I_0 an aligned fact-saturated superinstance J_1 of I_0. This is the
saturation process.

We now satisfy Σ_UID by a completion process. Build a manageable partition (P_1, ..., P_n) of Σ_UID by
Lemma VI.5. Now, for 1 ≤ i ≤ n, use fact-thrifty chase steps by UIDs of P_i to extend the fact-saturated
aligned superinstance J_i to a larger one J_{i+1} that satisfies P_i. If P_i is trivial, use Corollary VI.4. If P_i is
reversible, apply the Fact-Thrifty Completion Proposition (Proposition V.10), taking Σ_UID to be P_i. By
Lemma VI.2, the result J_{i+1} still satisfies P_j for all j ≤ i.

Hence the result J_{n+1} of the completion process is an aligned superinstance of I_0 that satisfies Σ_UID;
as an aligned superinstance, it is also finite, satisfies Σ_UFD, and is k-sound for ACQ; so it is k-universal
for Σ_U and ACQ. This concludes the proof of the Acyclic Unary Universal Models Theorem.

VII. Higher-Arity FDs 

We now bootstrap the Acyclic Unary Universal Models Theorem (Theorem III.6) to the Universal
Models Theorem (Theorem III.5). The first step is to change our construction to avoid violating
higher-arity FDs, namely, show the following, which applies to Σ = Σ_UID ∧ Σ_FD rather than Σ_U = Σ_UID ∧ Σ_UFD:

Theorem VII.1 (Acyclic universal models). There is a finite superinstance of I_0 that is k-universal
for Σ and ACQ queries.

The problem to address is that our completion process to satisfy Σ_UID was defined with fact-thrifty
chase steps, which reuse elements from the same facts at the same positions multiple times. This may
violate Σ_FD, and we can show that it is the only point where we do so in the construction.

The goal of this section is to define a new version of thrifty chase steps that preserves Σ_FD rather
than just Σ_UFD; we call them envelope-thrifty chase steps. We first describe the new saturation process
designed for them. Second, we define how they work, redefine the completion process of the previous
section to use them, and use this new completion process to prove the Acyclic Universal Models
Theorem above.

VII.1. Envelopes and saturation 

We start by defining a new notion of saturated instances. Recall the notions of fact classes
(Definition V.4) and thrifty chase steps (Definition V.8). When a thrifty chase step wants to create a fact F_n
whose chase witness F_w has fact class (R^p, C), it needs elements to reuse in F_n at positions of NDng(R^p).
They must have the right sim-image and must already occur at the positions where they are reused.

Fact-thrifty chase steps reuse a tuple of elements from one fact F_r, and thus apply to fact-saturated
instances with one fact for each class. Our new notion of envelope-thrifty chase steps will need saturated
instances that have multiple reusable tuples. A set of such tuples is called an envelope for (R^p, C):

Definition VII.2. Consider D = (R^p, C) in AFactCl, and write O := NDng(R^p). An envelope E for D
and for an aligned superinstance J = (I, sim) of I_0 is a non-empty set of |O|-tuples indexed by O, with
domain dom(I), such that:

• for every FD φ : R^L → R^r of Σ_FD with R^L ⊆ O and R^r ∈ O, E satisfies φ (seeing its tuples as facts
on O);

• for every FD φ : R^L → R^r of Σ_FD with R^L ⊆ O and R^r ∉ O, for all t, t' ∈ E, π_{R^L}(t) = π_{R^L}(t') implies
t = t';

• for every a ∈ dom(E), there is exactly one position R^q ∈ O such that a ∈ π_{R^q}(E); and then we
also have a ∈ π_{R^q}(J);

• for any fact F = R(a) of J and R^q ∈ O, if a_q ∈ π_{R^q}(E), then F achieves D in J and π_O(a) ∈ E.

Intuitively, the tuples in the envelope E satisfy the UFDs of Σ_UFD within NDng(R^p), and never
overlap on positions that determine a position out of NDng(R^p). Further, their elements already occur
at the positions where they will be reused, and have the right sim-image for the fact class D. To simplify
the reasoning, we also impose that each element of E is used at only one position, and occurs at that
position only in facts which achieve D and whose projection to NDng(R^p) is in E.

Depending on O, it may be possible to use a singleton tuple as the envelope, like fact-thrifty chase
steps, and not violate Σ_FD. The class is then safe. Otherwise, we focus on the envelope tuples which do
not appear in the instance yet.

Definition VII.3. We call (R^p, C) in AFactCl safe if there is no FD R^L → R^r in Σ_FD with R^L ⊆ NDng(R^p)
and R^r ∉ NDng(R^p).

Letting E be an envelope for (R^p, C) and J be an aligned superinstance, the remaining tuples of E
are E \ π_{NDng(R^p)}(J) if (R^p, C) is unsafe, and E if it is safe.

We now introduce the notion of global envelopes, that give us one envelope per class of AFactCl.
This leads to our new notion of saturation: a saturated instance has a global envelope with many
remaining tuples in the unsafe classes. Note that this implies fact-saturation.

Definition VII.4. A global envelope ℰ for an aligned superinstance J = (I, sim) of I_0 is a mapping
from each D ∈ AFactCl to an envelope ℰ(D) for D and J, such that the envelopes have pairwise disjoint
domains.

We call J n-envelope-saturated if it has a global envelope ℰ such that ℰ(D) has ≥ n remaining
tuples for all unsafe D ∈ AFactCl. J is envelope-saturated if it is n-envelope-saturated for some n > 0, and
envelope-exhausted otherwise.

We now justify that we can make arbitrarily saturated superinstances of I'_0 (the switch to I'_0 is a
technicality):

Proposition VII.5 (Sufficiently envelope-saturated solutions). For any K ∈ ℕ and instance I_0, we can
build a superinstance I'_0 of I_0 that is k-sound for CQ, and an aligned superinstance J of I'_0 that
satisfies Σ_FD and is (K|J|)-envelope-saturated.

Example VII.6. For simplicity, we work with instances rather than aligned superinstances. Consider
I_0 := {S(a), T(z)}, the UIDs τ : S^1 ⊆ R^1 and τ' : T^1 ⊆ R^1 for a 3-ary relation R, and the FD
φ : R^2 R^3 → R^1. Consider I := I_0 ∪ {R(a, b, c)} obtained by one chase step of τ on S(a). It would
violate φ to perform a fact-thrifty chase step of τ' on z to create R(z, b, c), reusing (b, c) at
NDng(R^1) = {R^2, R^3}.

Now, consider the k-sound I'_0 := {S(a), T(z), S(a'), S(z')}, and I' := I'_0 ∪ {R(a, b, c), R(a', b', c')}
obtained by two chase steps. The two facts R(a, b, c) and R(a', b', c') would be mapped to the same fact
class D, so we can define ℰ(D) := {(b, c), (b', c'), (b', c), (b, c')}. We can now satisfy Σ_UID on I'
without violating φ, with two envelope-thrifty chase steps that reuse the remaining tuples (b', c) and (b, c')
of ℰ(D).
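
The FD violation in this example, and its absence when the remaining envelope tuples are used, can be verified with a few lines of code. This is a minimal sketch: facts of R are encoded as plain tuples, positions are 0-based, and the function name is illustrative.

```python
def satisfies_fd(facts, lhs, rhs):
    """Check the FD lhs → rhs on a collection of facts of a single relation.

    facts: iterable of tuples; lhs: tuple of 0-based positions; rhs: one 0-based position.
    """
    seen = {}
    for f in facts:
        key = tuple(f[i] for i in lhs)
        # Two facts that agree on lhs must agree on rhs.
        if seen.setdefault(key, f[rhs]) != f[rhs]:
            return False
    return True


# The FD R^2 R^3 → R^1 of the example, with 0-based positions.
LHS, RHS = (1, 2), 0

fact_thrifty = [("a", "b", "c"), ("z", "b", "c")]
envelope_thrifty = [("a", "b", "c"), ("a'", "b'", "c'"),
                    ("z", "b'", "c"), ("z'", "b", "c'")]
```

Reusing (b, c) a second time makes two facts agree on R^2 and R^3 while differing on R^1, whereas the four R-facts obtained with the tuples (b', c) and (b, c') have pairwise distinct projections to R^2 R^3.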

The crucial result needed for the Sufficiently Envelope-Saturated Solutions Proposition is the
following, which may be of independent interest, and is proved in Appendix F.2 using a combinatorial
construction. The fact that unary keys are problematic is the reason why we handle safe classes differently.

Theorem VII.7 (Dense interpretations). For any set Σ_FD of FDs over a relation R with no unary key,
and K ∈ ℕ, there exists a non-empty instance I of R that satisfies Σ_FD and has at least K|dom(I)| facts.

Hence, we have defined the new notion of n-envelope-saturation, and a saturation process to achieve
it: the Sufficiently Envelope-Saturated Solutions Proposition. Unlike the Fact-Saturated Solutions
Lemma, where one fact of each class was enough, we have shown that envelope-saturated
superinstances may have an arbitrarily high saturation relative to the instance size.

VII.2. Envelope-thrifty chase steps 

We can now introduce envelope-thrifty chase steps:

Definition VII.8. Envelope-thrifty chase steps are thrifty chase steps (Definition V.8) applicable to
envelope-saturated aligned superinstances. Let S^q be the exported position of the new fact F_n, let
F_w = S(b') be the chase witness, and let D = (S^q, C) ∈ AFactCl be the fact class of F_w. We choose some
remaining tuple t of ℰ(D) and define b_r := t_r for all S^r ∈ NDng(S^q).

Recall from Lemma V.9 that fact-thrifty chase steps apply to fact-saturated aligned superinstances,
and never violate Σ_UFD. Similarly, envelope-thrifty chase steps apply to envelope-saturated aligned
superinstances, and never violate Σ_FD:

Lemma VII.9. For n > 0, for any n-envelope-saturated aligned superinstance J that satisfies Σ_FD,
the result J' of an envelope-thrifty chase step on J is an (n − 1)-envelope-saturated superinstance that
satisfies Σ_FD.

We now modify the Fact-Thrifty Completion Proposition (Proposition V.10), generalized without
assumption reversible as in the previous section, to use envelope-thrifty chase steps instead of fact-thrifty
chase steps. This is possible because the choice of reused elements at non-dangerous positions makes
no difference in terms of applicable UIDs, as they already occur at the position where they are reused.
Hence, we can perform the exact same process as before (except the non-dangerous reuses), using
Lemma VII.9 to justify that Σ_FD is preserved; but we must abort if we reach an envelope-exhausted
instance:

Proposition VII.10 (Envelope-thrifty completion). For any envelope-saturated aligned superinstance
J of I_0 that satisfies Σ_FD, we can obtain by envelope-thrifty chase steps an aligned superinstance J' of
I_0, such that J' is either envelope-exhausted or satisfies Σ.

The last problem to address is exhaustion. Unlike fact-saturation, envelope-saturation "runs out":
whenever we use a remaining tuple t in a chase step to create F_n and obtain a new aligned
superinstance J', then we cannot use t again in J'. So we must start with a sufficiently envelope-saturated
superinstance, and we must control how many chase steps are applied in the envelope-thrifty
completion process. From the details of our construction, we can show the following:

Lemma VII.11 (Envelope blowup). There exists B ∈ ℕ depending only on k and Σ_U such that, for any
aligned superinstance J = (I, sim) of I_0, and global envelope ℰ, letting J' = (I', sim') be the result of
the envelope-thrifty completion process, we have |I'| < B|I|.

We can now conclude the proof of the Acyclic Universal Models Theorem (Theorem VII.1) that we
stated at the beginning of this section. Start by applying the saturation process of the Sufficiently
Envelope-Saturated Solutions Proposition to obtain an aligned superinstance J = (I, sim) of some
k-sound I'_0, such that J satisfies Σ_FD and is (B|I|)-envelope-saturated. Now, apply the Envelope-Thrifty
Completion Proposition to obtain an aligned superinstance J' of I'_0. By the Envelope Blowup Lemma,
J' contains < B|I| new facts, so, by Lemma VII.9, J' must still be 1-envelope-saturated. Hence, J'
satisfies Σ. This concludes the proof, as J' is an aligned superinstance of I'_0.


VIII. Cyclic Queries 

We now finally complete our proof of the Universal Models Theorem (Theorem III.5) by moving from
acyclic Boolean CQs to arbitrary Boolean CQs. We do so by a generic process which is essentially
independent from our previous construction.

Intuitively, the only cyclic CQs that hold in Chase(I_0, Σ_UID) either have an acyclic self-homomorphic
match (so they are implied by an acyclic CQ that also holds) or have all cycles matched to elements
of I_0. Hence, in a k-sound instance for CQ, no other cyclic queries must be true. We ensure this by a
cycle blowup process that takes the product of our I with a group of high girth, following Otto [14].
However, we need to adjust this construction to avoid creating FD violations.

We let J_f = (I_f, sim) be the aligned superinstance obtained from the Acyclic Universal Models
Theorem (Theorem VII.1). Its underlying instance I_f is a finite superinstance of I_0 that satisfies Σ, and the
k-bounded simulation sim guarantees that I_f is k-sound for ACQ. Our goal in this section is to make
I_f k-sound for CQ while still satisfying Σ, so that it is k-universal. This will conclude the proof of the
Universal Models Theorem (Theorem III.5).

VIII.1. Simple product 

Let us first introduce preliminary notions:

Definition VIII.1. A group G = (S, ·) over a finite set S consists of an associative product law · : S² → S,
a neutral element e ∈ S, and an inverse law ·^{-1} : S → S such that x · x^{-1} = x^{-1} · x = e for all x ∈ S. We
say that G is generated by X ⊆ S if all elements of S can be written as a product of elements of X and
X^{-1} := {x^{-1} | x ∈ X}.

Given a group G generated by X, the girth of G under X is the length of the shortest non-empty word
w of elements of X and X^{-1} such that w_1 ⋯ w_n = e and w_i ≠ w_{i+1}^{-1} for all 1 ≤ i < n. (If X = {g} with
g = e, the girth is 1.)

Lemma VIII.2 ([12]). For all n ∈ ℕ and finite non-empty set X, there is a finite group G = (S, ·)
generated by X with girth ≥ n under X. We call G an n-acyclic group generated by X.

In other words, in an n-acyclic group generated by X, there is no short product of elements of X and
their inverses which evaluates to e, except those that include a factor x x^{-1}.

We now take the product of I_f with such a finite group G. This ensures that any cycles in the product
instance are large, because they project to cycles in G. We use a specific generator:

Definition VIII.3. The fact labels of a superinstance I of I_0 are Λ(I) := {l_i^F | F ∈ I \ I_0, 1 ≤ i ≤ |F|}.

Now, we define the product of a superinstance I of I_0 with a group generated by Λ(I). We make sure
not to blow up cycles in I_0, so the result remains a superinstance of I_0:

Definition VIII.4. Let I be a finite superinstance of I_0 and G be a finite group generated by Λ(I). The
product of I by G preserving I_0 is the finite instance (I, I_0) ⊗ G with domain dom(I) × G consisting of
the following facts, for all g ∈ G:

• For every fact R(a) of I_0, the fact R((a_1, g), ..., (a_|R|, g)).

• For every fact F = R(a) of I \ I_0, the fact R((a_1, g · l_1^F), ..., (a_|R|, g · l_|R|^F)).

We identify (a, e) to a for a ∈ dom(I_0), so (I, I_0) ⊗ G is still a superinstance of I_0.

We say a superinstance I of I_0 is k-instance-sound (for Σ) if for any CQ q such that |q| ≤ k, if q
has a match in I involving an element of I_0, then Chase(I_0, Σ_UID) ⊨ q. We can ensure that I_f is
k-instance-sound, up to having performed k chase rounds on I_0 initially. We can then state the following
property:

Lemma VIII.5 (Simple product). Let I be a finite superinstance of I_0 and G a finite (2k + 1)-acyclic
group generated by Λ(I). If I is k-sound for ACQ and k-instance-sound, then (I, I_0) ⊗ G is k-sound for
CQ.

Example VIII.6. Consider F_0 := R(a, b), I_0 := {F_0}, Σ_UID consisting of τ : R^2 ⊆ S^1, τ' : S^2 ⊆ R^1,
and (τ')^{-1}, F := S(b, a), and I := I_0 ∪ {F}. I satisfies Σ_UID and is sound for ACQ, but not for
CQ: take for instance q : ∃xy R(x, y) ∧ S(y, x), which is cyclic and holds in I while (I_0, Σ_UID) ⊭_unr q.
We have Λ(I) = {l_1^F, l_2^F}. Identify l_1^F and l_2^F to 1 and 2 and consider the group G := ({0, 1, 2}, +)
where + is addition modulo 3. G has girth 2 under Λ(I).

The product I_p := (I, I_0) ⊗ G, writing pairs as subscripts for brevity, is {R(a_0, b_0), R(a_1, b_1), R(a_2, b_2),
S(b_1, a_2), S(b_2, a_0), S(b_0, a_1)}. In this case I_p happens to be 5-sound for CQ.
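
The product of this example can be recomputed mechanically. The sketch below is restricted to cyclic groups Z/mZ and rests on an assumption about the product rule: a fact R(a) of I \ I_0 with labels l_1, ..., l_|R| yields, for each g, the fact R((a_1, g + l_1), ..., (a_|R|, g + l_|R|)), which is the reading that reproduces the facts listed above. The function name and the encoding of facts as (relation, tuple) pairs are illustrative.

```python
def simple_product(base, extra, m):
    """Product of an instance with the cyclic group Z/mZ, preserving the base facts.

    base: facts of I_0, each a pair (relation, args).
    extra: dict mapping each fact of I outside I_0 to its tuple of labels in Z/mZ.
    """
    out = set()
    for g in range(m):
        # Facts of I_0 are copied once per group element, without shifting.
        for rel, args in base:
            out.add((rel, tuple((a, g) for a in args)))
        # Other facts shift each position by the label of that position.
        for (rel, args), labels in extra.items():
            out.add((rel, tuple((a, (g + l) % m) for a, l in zip(args, labels))))
    return out


base = {("R", ("a", "b"))}
extra = {("S", ("b", "a")): (1, 2)}
product = simple_product(base, extra, 3)
```

The resulting set contains the three copies R(a_g, b_g) together with S(b_1, a_2), S(b_2, a_0) and S(b_0, a_1), so the 2-cycle formed by R(a, b) and S(b, a) in I is unfolded into a cycle of length 6.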

We cannot conclude directly with the simple product, because I_p := (I_f, I_0) ⊗ G may violate Σ_UFD
even though I_f ⊨ Σ_FD. Indeed, there may be a relation R, a UFD φ : R^p → R^q in Σ_UFD, and two R-facts
F and F' in I_f \ I_0 with π_{R^p,R^q}(F) = π_{R^p,R^q}(F'). In I_p the images of F and F' may overlap only on R^p, so
they could violate φ.

VIII.2. Mixed product 

What we need is a more refined notion of product, that does not attempt to blow up cycles within fact
overlaps. To define it, we need to consider a quotient of I_f:

Definition VIII.7. The quotient $I/{\sim}$ of an instance $I$ by an equivalence relation $\sim$ on $\mathrm{dom}(I)$ is defined as follows:

• $\mathrm{dom}(I/{\sim})$ is the equivalence classes of $\sim$ on $\mathrm{dom}(I)$,

• $I/{\sim}$ contains one fact $R(\mathbf{A})$ for every fact $R(\mathbf{a})$ of $I$, where $A_i$ is the $\sim$-class of $a_i$ for all $R^i \in \mathrm{Pos}(R)$.

The quotient homomorphism $\chi_{\sim}$ is the homomorphism from $I$ to $I/{\sim}$ defined accordingly.
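As an illustration of Definition VIII.7, here is a small sketch that computes a quotient instance. It assumes that facts are encoded as `(relation, tuple)` pairs and that the equivalence relation is given as a map from elements to class representatives; the function name `quotient` is hypothetical.

```python
def quotient(facts, class_of):
    """Quotient of an instance by an equivalence relation.

    facts: set of (relation, tuple_of_elements)
    class_of: maps each element to a hashable representative of its class
    Returns the quotient instance and the quotient homomorphism on elements.
    """
    quotient_facts = {(rel, tuple(class_of[a] for a in args))
                      for rel, args in facts}
    return quotient_facts, dict(class_of)


I = {("R", ("a", "b")), ("R", ("c", "d")), ("S", ("b", "c"))}
classes = {"a": "A", "c": "A", "b": "B", "d": "B"}
Iq, chi = quotient(I, classes)
```

Here the two $R$-facts collapse to a single fact of the quotient, and applying `chi` componentwise to any fact of `I` yields a fact of `Iq`, as expected of a homomorphism.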

We quotient $I_f$ by the equivalence relation $\simeq_k$ (recall Definition V.1), yielding $I'_f := I_f/{\simeq_k}$. The resulting $I'_f$ may no longer satisfy $\Sigma$. However, it is still $k$-sound for ACQ, for the following reason:

Lemma VIII.8. Any $k$-bounded simulation from an instance $I$ to an instance $I'$ defines a $k$-bounded simulation from $I/{\simeq_k}$ to $I'$.

We then consider the homomorphism $\chi_{\simeq_k}$ from $I_f$ to $I'_f$, and blow up cycles in $I_f$ by a mixed product that only distinguishes facts with a different image in $I'_f$ by $\chi_{\simeq_k}$. The point is that, as we show from our construction, facts of $I_f$ that have the same elements at the same positions always have the same $\simeq_k$-class. Hence, they are mapped to the same fact by $\chi_{\simeq_k}$, and will not be distinguished by the mixed product. Let us formalize this:

Definition VIII.9. Let $I$ be a superinstance of $I_0$ and $h$ be a homomorphism from $I$ to some instance $I'$. We say $I$ is cautious for $h$ (and $I_0$) if for any relation $R$, for any two $R$-facts $F$ and $F'$ such that $\pi_{R^p}(F) = \pi_{R^p}(F')$ for some $R^p \in \mathrm{Pos}(R)$, either $F, F' \in I_0$, or $h(F) = h(F')$.

Lemma VIII.10 (Cautiousness). The superinstance $I_f$ of $I_0$ constructed by the Acyclic Universal Models Theorem (Theorem VII.1) is cautious for $\chi_{\simeq_k}$.

The reason why $I_f$ is cautious for $h := \chi_{\simeq_k}$ is that, except for facts of $I_0$, overlaps between facts only occur when reusing envelope elements at non-dangerous positions, in which case the sim-images of both facts are $\simeq_k$-equivalent in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$. We can then show that, from our construction, such elements are actually $\simeq_k$-equivalent in $I_f$.

We now define the notion of mixed product, which uses the same fact label for facts with the same image by $h$:

Definition VIII.11. Let $I$ be a finite superinstance of $I_0$ with a homomorphism $h$ to another finite superinstance $I'$ of $I_0$ such that $h_{|I_0}$ is the identity and $h_{|I \setminus I_0}$ maps to $I' \setminus I_0$. Let $G$ be a finite group generated by $\Lambda(I')$.

The mixed product of $I$ by $G$ via $h$ preserving $I_0$, written $(I, I_0) \otimes^h G$, is the finite superinstance of $I_0$ with domain $\mathrm{dom}(I) \times G$ consisting of the following facts, for every $g \in G$:

• For every fact $R(\mathbf{a})$ of $I_0$, the fact $R((a_1, g), \ldots, (a_{|R|}, g))$.

• For every fact $R(\mathbf{a})$ of $I \setminus I_0$, the following fact:

We now show that the mixed product preserves UIDs and FDs when cautiousness is assumed. 

Lemma VIII.12 (Mixed product preservation). For any UID or FD $\tau$, if $I \models \tau$ and $I$ is cautious for $h$, then $(I, I_0) \otimes^h G \models \tau$.

Second, we show that $h : I \to I'$ lifts to a homomorphism from the mixed product to the simple product.

Lemma VIII.13 (Mixed product homomorphism). There is a homomorphism from $(I, I_0) \otimes^h G$ to $(I', I_0) \otimes G$ which is the identity on $I_0 \times G$.

We can now conclude our proof of the Universal Models Theorem (Theorem III.5). We construct $J_f = (I_f, \mathrm{sim})$ by the Acyclic Universal Models Theorem (Theorem VII.1) and consider $I_f$. It is a finite superinstance of $I_0$ which is $k$-universal for $\Sigma$ and ACQ. Further, up to having distinguished the elements of $I_0$ with fresh predicates and having performed initial chasing, we can ensure that $I'_f := I_f/{\simeq_k}$ is $k$-instance-sound and that the homomorphism $\chi_{\simeq_k}$ from $I_f$ to $I'_f$ satisfies the hypotheses of the mixed product.

Let $G$ be a $(2k+1)$-acyclic group generated by $\Lambda(I'_f)$, and consider $I_p := (I'_f, I_0) \otimes G$. As $I_f$ was $k$-sound for ACQ, so is $I'_f$ by Lemma VIII.8, and as $I'_f$ is also $k$-instance-sound, $I_p$ is $k$-sound for CQ by the Simple Product Lemma (Lemma VIII.5). However, as we explained, in general $I_p \not\models \Sigma$. We thus construct $I_m := (I_f, I_0) \otimes^h G$, with $h := \chi_{\simeq_k}$. By the Mixed Product Homomorphism Lemma, $I_m$ has a homomorphism to $I_p$, so it is also $k$-sound for CQ. Further, $I_f$ is cautious for $\chi_{\simeq_k}$ by the Cautiousness Lemma, so, by the Mixed Product Preservation Lemma, we have $I_m \models \Sigma$ because $I_f \models \Sigma$.

Hence, the mixed product $I_m$ is a finite $k$-universal instance for $\Sigma$ and CQ. This concludes the proof of the Universal Models Theorem, and hence of our main theorem (Theorem III.2).

IX. Conclusion


In this work we have developed the first techniques on arbitrary arity schemas to build finite models that satisfy both referential constraints and number restrictions, while controlling which CQs are satisfied. We have used this to prove that finite open-world query answering for CQs, UIDs and FDs is finitely controllable up to finite closure of the dependencies. Using this, we have isolated the complexity of FQA for UIDs and FDs.

As presented the constructions are quite specific to dependencies, but in future work we will look to extend them to constraint languages containing disjunction, with the goal of generalizing to higher arity the rich arity-2 constraint languages of, e.g., [10, 15], while maintaining the decidability of FQA.

A. Details about the UID chase and Unique Witness Property

Recall the Unique Witness Property:

For any element $a \in \mathrm{dom}(\mathrm{Chase}(I, \Sigma_{\mathrm{UID}}))$ and position $R^p$ of $\sigma$, if two facts of $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$ contain $a$ at position $R^p$, then they are both facts of $I$.

We first exemplify why this may not be guaranteed by the first round of the UID chase. Consider the instance $I = \{R(a), S(a)\}$ and the UIDs $\tau_1 : R^1 \subseteq T^1$ and $\tau_2 : S^1 \subseteq T^1$, where $T$ is binary. Applying a round of the UID chase creates the instance $\{R(a), S(a), T(a, b_1), T(a, b_2)\}$, with $T(a, b_1)$ being created by applying $\tau_1$ to the active fact $R(a)$, and $T(a, b_2)$ being created by applying $\tau_2$ to the active fact $S(a)$.

By contrast, the core chase would create only one of these two facts, because it would consider that 
two new facts are equivalent: they have the same exported element occurring at the same position. In 
general, the core chase keeps only one fact within each class of equivalent facts. 
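The difference between the two kinds of chase round on this example can be replayed with a short sketch. It is a simplification under stated assumptions: positions are 0-based, fresh elements are named `b1`, `b2`, and so on, and the `core=True` mode only merges new facts that export the same element to the same position, which is the behaviour described above rather than a full core computation. The name `chase_round` is hypothetical.

```python
import itertools


def chase_round(facts, uids, arity, core=False):
    """One chase round for UIDs (simplified sketch).

    facts: set of (relation, tuple)
    uids: list of ((R, p), (S, q)) meaning that position p of R is included
          in position q of S, with 0-based positions
    arity: maps relation names to arities
    With core=True, at most one new fact is created per (element, position).
    """
    fresh = itertools.count(1)
    new_facts = set()
    seen = set()
    for (r, p), (s, q) in uids:
        at_target = {args[q] for rel, args in facts if rel == s}
        exported = {args[p] for rel, args in facts if rel == r} - at_target
        for a in sorted(exported):
            if core and (a, s, q) in seen:
                continue  # an equivalent new fact already exists
            seen.add((a, s, q))
            args = tuple(a if i == q else f"b{next(fresh)}"
                         for i in range(arity[s]))
            new_facts.add((s, args))
    return facts | new_facts


I = {("R", ("a",)), ("S", ("a",))}
uids = [(("R", 0), ("T", 0)), (("S", 0), ("T", 0))]
arity = {"R": 1, "S": 1, "T": 2}
uid_result = chase_round(I, uids, arity)
core_result = chase_round(I, uids, arity, core=True)
```

With these inputs the plain round creates two $T$-facts containing `a` at the first position, while the deduplicating round creates only one.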

However, after one chase round by the core chase, there is no longer any distinction between the UID chase and the core chase, because the following property holds on the result $I'$ of a chase round (by the core chase or the UID chase) on any instance $I''$: (*) for any $\tau \in \Sigma_{\mathrm{UID}}$ and element $a \in \mathrm{Wants}(I', \tau)$, $a$ occurs in only one fact of $I'$. This is true because $\Sigma_{\mathrm{UID}}$ is transitively closed, so we know that no UID of $\Sigma_{\mathrm{UID}}$ is applicable to an element of $\mathrm{dom}(I'')$ in $I'$; hence the only elements that witness violations occur in the one fact where they were introduced in $I'$.

We now claim that (*) implies the Unique Witness Property. Indeed, assume to the contrary that $a \in \mathrm{dom}(\mathrm{Chase}(I, \Sigma_{\mathrm{UID}}))$ violates it, so that $a$ occurs at some position $R^p$ in two facts $F_1$ and $F_2$ that are not both facts of $I$.

If $a \in \mathrm{dom}(I)$, because $\Sigma_{\mathrm{UID}}$ is transitively closed, after the first chase round on $I$, we no longer create any fact that involves $a$. Hence, each one of $F_1$ and $F_2$ is either a fact of $I$ or a fact created in the first round of the chase (which is a chase round by the core chase). However, if one of $F_1$ and $F_2$ is in $I$, then it witnesses that we could not have $a \in \mathrm{Wants}(I, R^p)$, so it is not possible that the other fact was created in the first chase round. It cannot be the case either that $F_1$ and $F_2$ were both created in the first chase round, by definition of the core chase. Hence, $F_1$ and $F_2$ are necessarily both facts of $I$.

If $a \in \mathrm{dom}(\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})) \setminus \mathrm{dom}(I)$, assume that $a$ occurs at position $R^p$ in two facts $F_1$, $F_2$. As $a \notin \mathrm{dom}(I)$, none of them is a fact of $I$. We then show a contradiction. It is not possible that one of those facts was created in a chase round before the other, as otherwise the second created fact could not have been created because of the first created fact. Hence, both facts must have been created in the same chase round. So there was a chase round from $I''$ to $I'$ where we had $a \in \mathrm{Wants}(I'', R^p)$ and both $F_1$ and $F_2$ were created respectively from active facts $F'_1$ and $F'_2$ of $I''$ by UIDs $\tau_1 : S^q \subseteq R^p$ and $\tau_2 : T^r \subseteq R^p$. But then, by property (*), $a$ occurs in only one fact, so as it occurs in $F'_1$ and $F'_2$ we have $F'_1 = F'_2$. Further, as $a \notin \mathrm{dom}(I)$, $F'_1$ and $F'_2$ are not facts of $I$ either, so by definition of the UID chase and of the core chase, it is easy to see $a$ occurs at only one position in $F'_1 = F'_2$. This implies that $\tau_1 = \tau_2$. Hence, we must have $F_1 = F_2$.

B. Proofs for Section III: Main Result and Overall Approach

B.1. Proof of Proposition III.1 (Complexity of UQA for FDs and UIDs)

Proposition III.1. UQA for FDs and UIDs has PTIME data complexity and NP-complete combined complexity.

We first show the results for UIDs in isolation. UQA for UIDs is NP-complete in combined complexity: the lower bound is immediate from query evaluation [1], the upper bound is by Johnson & Klug [11] and actually holds for IDs of arbitrary fixed arity (which they call "width"). For data complexity, Calì et al. [6] showed a PTIME (in fact, $\mathrm{AC}^0$) upper bound for arbitrary IDs by observing that the certain answers can be expressed by another first-order query.

We now show that the same upper bounds apply to UQA for UIDs and FDs (the lower bound clearly also applies). This result is implicit in prior work of [5, 4], but we prove it here for completeness. We argue that UIDs and FDs are separable. This means that for any conjunction $\Sigma$ of FDs $\Sigma_{\mathrm{FD}}$ and UIDs $\Sigma_{\mathrm{UID}}$, for any instance $I_0$ and CQ $q$, if $I_0 \models \Sigma_{\mathrm{FD}}$ then we have $(I_0, \Sigma) \models_{\mathrm{unr}} q$ iff $(I_0, \Sigma_{\mathrm{UID}}) \models_{\mathrm{unr}} q$. From this result, the upper bounds follow from the bounds for the UID case above, since checking whether $I_0 \models \Sigma_{\mathrm{FD}}$ can be done in PTIME. Separability follows from the non-conflicting condition of [5, 4] but we give a simpler argument.
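To make the PTIME check of the FDs on the initial instance concrete, here is a minimal sketch of a single-pass test for one FD. The encoding of facts as `(relation, tuple)` pairs, the 0-based positions, and the name `satisfies_fd` are assumptions made for illustration.

```python
def satisfies_fd(facts, relation, lhs, rhs):
    """Check the FD lhs -> rhs on `relation`.

    lhs: tuple of 0-based determining positions; rhs: one determined position.
    Runs in one pass over the facts, remembering the value seen for each key.
    """
    seen = {}
    for rel, args in facts:
        if rel != relation:
            continue
        key = tuple(args[i] for i in lhs)
        # setdefault stores the first value seen and returns the stored one.
        if seen.setdefault(key, args[rhs]) != args[rhs]:
            return False
    return True


I0 = {("R", ("a", "b")), ("R", ("a", "c"))}
first_determines_second = satisfies_fd(I0, "R", (0,), 1)
second_determines_first = satisfies_fd(I0, "R", (1,), 0)
```

On this instance the first position does not determine the second, while the second determines the first; checking a conjunction of FDs amounts to repeating the test for each of them.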

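Assume that $I_0$ satisfies $\Sigma_{\mathrm{FD}}$. Clearly if $(I_0, \Sigma_{\mathrm{UID}}) \models_{\mathrm{unr}} q$ then $(I_0, \Sigma) \models_{\mathrm{unr}} q$. We thus need to show that if $(I_0, \Sigma) \models_{\mathrm{unr}} q$ then $(I_0, \Sigma_{\mathrm{UID}}) \models_{\mathrm{unr}} q$. Consider $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$. If $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}) \models \Sigma_{\mathrm{FD}}$, then $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ is a superinstance of $I_0$ that satisfies $\Sigma$, so because $(I_0, \Sigma) \models_{\mathrm{unr}} q$ we must have $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}) \models q$. By universality of the chase, this implies $(I_0, \Sigma_{\mathrm{UID}}) \models_{\mathrm{unr}} q$.
Hence, it suffices to show that $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}) \models \Sigma_{\mathrm{FD}}$. Assume to the contrary the existence of $F$ and $F'$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ violating an FD of $\Sigma_{\mathrm{FD}}$. There must exist a position $R^p \in \mathrm{Pos}(\sigma)$ such that $\pi_{R^p}(F) = \pi_{R^p}(F')$. By the Unique Witness Property, this implies that $F$ and $F'$ are facts of $I_0$, which is impossible by our assumption that $I_0 \models \Sigma_{\mathrm{FD}}$.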

B.2. Proof of the Main Theorem (Theorem III.2) from the Universal Models Theorem (Theorem III.5)

To show the Main Theorem from the Universal Models Theorem, let $\Sigma$ be a conjunction of FDs and UIDs, $\Sigma'$ its finite closure, and $I_0$ a finite instance. We want to show finite controllability up to finite closure, namely, $(I_0, \Sigma) \models_{\mathrm{fin}} q$ iff $(I_0, \Sigma') \models_{\mathrm{unr}} q$.

We can assume without loss of generality that $I_0$ satisfies the FDs of $\Sigma'$, as otherwise there is no superinstance of $I_0$ satisfying $\Sigma'$, and both problems are always vacuously true.

It is clear that for any CQ $q$, we have $(I_0, \Sigma) \models_{\mathrm{fin}} q$ iff $(I_0, \Sigma') \models_{\mathrm{fin}} q$. Indeed, $\Sigma'$ includes $\Sigma$ and conversely any finite superinstance of $I_0$ which satisfies $\Sigma$ must satisfy $\Sigma'$, by definition of the finite closure. So in fact, to prove finite controllability up to finite closure, it suffices to show that $(I_0, \Sigma') \models_{\mathrm{fin}} q$ iff $(I_0, \Sigma') \models_{\mathrm{unr}} q$ for any CQ $q$. The backward implication is immediate as all finite superinstances of $I_0$ satisfying $\Sigma'$ are also unrestricted superinstances. We prove the contrapositive of the forward implication.

Let $q$ be a CQ, let $k := |q|$, and assume that $(I_0, \Sigma') \not\models_{\mathrm{unr}} q$. By the Universal Models Theorem, let $I$ be a finite superinstance of $I_0$ that is $|q|$-sound and satisfies $\Sigma'$. As $I$ is $|q|$-sound, we have $I \not\models q$, so, as $I$ is a finite superinstance of $I_0$ that satisfies $\Sigma'$, it witnesses that $(I_0, \Sigma') \not\models_{\mathrm{fin}} q$. This proves the desired equivalence. Hence, we have established that $\Sigma'$ is finitely controllable up to finite closure, and have proved the Main Theorem.

B.3. Proof of Corollary III.3 (Complexity of FQA for FDs and UIDs)

Corollary III.3. FQA for FDs and UIDs has PTIME data complexity and NP-complete combined complexity.

By our Main Theorem (Theorem III.2), any instance $(I, \Sigma, q)$ to the FQA problem, formed of an instance $I$, a conjunction $\Sigma$ of UIDs $\Sigma_{\mathrm{UID}}$ and FDs $\Sigma_{\mathrm{FD}}$, and a CQ $q$, reduces to the UQA instance $(I, \Sigma', q)$, where $\Sigma'$ is the finite closure of $\Sigma$. Computing $\Sigma'$ from $\Sigma$ is data-independent, so the PTIME data complexity result of Proposition III.1 clearly still applies. It is also clear that the NP-hardness combined complexity bound of Proposition III.1 can be re-proven for FQA, as it already held even when $\Sigma = \emptyset$. So we only need to show that the combined complexity of FQA is in NP. A naive approach would be to compute explicitly $\Sigma'$ and solve the UQA instance $(I, \Sigma', q)$, but materializing $\Sigma'$ may take exponential time.

Instead, remember that from our study of UQA complexity in the proof of Proposition III.1, UQA for UIDs and FDs can be performed by first checking the FDs on the initial instance, and then performing UQA for the UIDs in isolation. Hence, let $\Sigma'_{\mathrm{UID}}$ and $\Sigma'_{\mathrm{FD}}$ be the UIDs and FDs of $\Sigma'$. Rather than materializing $\Sigma'$, we will show that we can decide whether $I \models \Sigma'_{\mathrm{FD}}$ in PTIME, and compute $\Sigma'_{\mathrm{UID}}$ in PTIME, which suffices to prove the claim as the combined complexity of deciding whether $(I, \Sigma'_{\mathrm{UID}}) \models_{\mathrm{unr}} q$ is then in NP.

We first justify that we can indeed compute $\Sigma'_{\mathrm{UID}}$ in PTIME. We consider every possible UID on positions occurring in $\Sigma$ (there are polynomially many), and for each of them, determine in PTIME from $\Sigma$ whether it is in $\Sigma'$, using the implication procedure of Cosmadakis et al. [8]. This allows us to compute $\Sigma'_{\mathrm{UID}}$ in PTIME.

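We next justify that we can decide whether $I \models \Sigma'_{\mathrm{FD}}$ in PTIME. For the same reason as for the UIDs, we can compute in PTIME from $\Sigma$ the set $\Sigma'_{\mathrm{UFD}}$ of the UFDs which are in $\Sigma'$, by deciding implication for each possible UFD. We now argue that to test whether $I \models \Sigma'_{\mathrm{FD}}$, it suffices to test whether $I \models \Sigma_{\mathrm{FD}}$ and whether $I \models \Sigma'_{\mathrm{UFD}}$. This follows if we can show that $\Sigma'_{\mathrm{FD}}$ is implied by $\Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$ by the usual axiomatization of unrestricted and finite implication for FDs alone, from Armstrong [2]. Indeed, in this case, if $I \models \Sigma'_{\mathrm{FD}}$ then $I \models \Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$ as it is a subset of $\Sigma'_{\mathrm{FD}}$, and conversely if $I \models \Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$ then $I$ satisfies $\Sigma'_{\mathrm{FD}}$, because the FDs of $\Sigma'_{\mathrm{FD}}$ are implied by $\Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$ and so are also satisfied by any instance that satisfies $\Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$.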

To justify that $\Sigma'_{\mathrm{FD}}$ is implied by $\Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$, we use Theorem 4.1 of [8], according to which a sound and complete axiomatization of the finite closure of FDs and UIDs consists of the usual FD implication rules, the standard UID axiomatization of Casanova et al. [7], and the cycle rule. So, consider any FD $\phi$ of $\Sigma'_{\mathrm{FD}}$ and let us justify that it is implied by $\Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$. If $\phi$ is a UFD, then $\phi \in \Sigma'_{\mathrm{UFD}}$. Otherwise the last steps of a derivation of $\phi$ with the axiomatization of [8] must be rules from the FD implication rules, as they are the only ones which can deduce higher-arity FDs. Let us group together the last FD implication rules that were applied, and consider the set $S$ of the hypotheses to FD implication rules that were not themselves produced by FD implication rules. Each hypothesis from $S$ is either an FD of $\Sigma_{\mathrm{FD}}$ or was produced by the cycle rule. Now, the cycle rule can only deduce UFDs (and UIDs). Hence, $S \subseteq \Sigma_{\mathrm{FD}} \cup \Sigma'_{\mathrm{UFD}}$, which implies that we can construct a derivation of $\phi$ from $\Sigma_{\mathrm{FD}} \cup \Sigma'_{\mathrm{UFD}}$ using the FD implication rules. Thus, we can indeed compute in PTIME $\Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$, and check in PTIME whether $I \models \Sigma'_{\mathrm{UFD}} \cup \Sigma_{\mathrm{FD}}$, and we have shown that this is equivalent to checking whether $I \models \Sigma'_{\mathrm{FD}}$. This concludes the proof.

C. Proofs for Section IV: Weak-Soundness and Reversible UIDs

This section proves the Acyclic Unary Weakly-Sound Models Proposition (Proposition IV.2), which weakens the Acyclic Unary Models Theorem (Theorem III.6) by making assumption reversible and replacing $k$-soundness by weak-soundness (Definition IV.1).

C.1. Proof of Proposition IV.4 (Satisfying UIDs in balanced instances)

Proposition IV.4. Assuming binary and reversible, any balanced finite instance $I$ satisfying $\Sigma_{\mathrm{UFD}}$ has a finite weakly-sound superinstance $I'$ that satisfies $\Sigma_{\mathrm{U}}$, with $\mathrm{dom}(I') = \mathrm{dom}(I)$.

For every relation $R$ of $\sigma$, let $f_R$ be a bijection between $\mathrm{Wants}(I, R^1)$ and $\mathrm{Wants}(I, R^2)$; this is possible, because $I$ is balanced.

Consider the superinstance $I'$ of $I$, with $\mathrm{dom}(I') = \mathrm{dom}(I)$, obtained by adding, for every $R$ of $\sigma$, the fact $R(a, f_R(a))$ for every $a \in \mathrm{Wants}(I, R^1)$. $I'$ is clearly a finite weakly-sound superinstance of $I$, because for every $a \in \mathrm{dom}(I')$, if $a$ occurs at some position $R^p$ in some fact $F$ of $I'$, then either $F$ is a fact of $I$ and $a \in \pi_{R^p}(I)$, or $F$ is a new fact and by definition $a \in \mathrm{Wants}(I, R^p)$.
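The completion step above can be sketched in a few lines. This is an illustration under assumptions: relations are binary with 0-based positions 0 and 1, UIDs are encoded as pairs of positions, the bijection $f_R$ is obtained by pairing the two sorted lists of wanting elements, and the names `wants` and `complete_balanced` are hypothetical.

```python
def wants(facts, uids, rel, pos):
    """Elements required at position (rel, pos) by some UID but absent from it."""
    present = {args[pos] for r, args in facts if r == rel}
    required = set()
    for (src_rel, src_pos), target in uids:
        if target == (rel, pos):
            required |= {args[src_pos] for r, args in facts if r == src_rel}
    return required - present


def complete_balanced(facts, uids, relations):
    """Add R(a, f_R(a)) for every a wanting the first position of R."""
    result = set(facts)
    for rel in relations:
        left = sorted(wants(facts, uids, rel, 0))
        right = sorted(wants(facts, uids, rel, 1))
        # Balancedness is what makes a bijection between the two sets exist.
        assert len(left) == len(right), "instance is not balanced"
        result |= {(rel, (a, b)) for a, b in zip(left, right)}
    return result


I = {("R", ("a", "b"))}
uids = [(("R", 0), ("R", 1)), (("R", 1), ("R", 0))]
I_completed = complete_balanced(I, uids, ["R"])
```

On this input the only added fact is `R(b, a)`: no new element is created, and afterwards no element wants to appear at either position of `R`.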

Let us show that $I' \models \Sigma_{\mathrm{UFD}}$. Assume to the contrary that there are two facts $F$ and $F'$ in $I'$ that witness a violation of a UFD $\phi : R^p \to R^q$ of $\Sigma_{\mathrm{UFD}}$. As $I \models \Sigma_{\mathrm{UFD}}$, one of $F$ and $F'$ is necessarily a new fact; we assume without loss of generality that it is $F$. Consider $a := \pi_{R^p}(F)$. By definition of the new facts, we have $a \in \mathrm{Wants}(I, R^p)$, so that $a \notin \pi_{R^p}(I)$. Now, as $\{F, F'\}$ is a violation, we must have $\pi_{R^p}(F) = \pi_{R^p}(F')$, so as $a \notin \pi_{R^p}(I)$, $F'$ must also be a new fact. Hence, by definition of the new facts, letting $b := \pi_{R^q}(F)$ and $b' := \pi_{R^q}(F')$, depending on whether $p = 1$ or $p = 2$ we have either $b = b' = f_R(a)$ or $b = b' = f_R^{-1}(a)$, which is well-defined because $f_R$ is a bijection. This contradicts the fact that $F$ and $F'$ violate $\phi$.

Let us now show that $I' \models \Sigma_{\mathrm{UID}}$. Assume to the contrary that there is an active fact $F = R(a_1, a_2)$, for a UID $\tau : R^p \subseteq S^q$. If $F$ is a fact of $I$, we had $a_p \in \mathrm{Wants}(I, S^q)$, so $F$ cannot be an active fact in $I'$ by construction of $I'$. So we must have $F \in I' \setminus I$. Hence, by definition of the new facts, we had $a_p \in \mathrm{Wants}(I, R^p)$; so there must be $\tau' : T^r \subseteq R^p$ in $\Sigma_{\mathrm{UID}}$ such that $a_p \in \pi_{T^r}(I)$. Hence, because $\Sigma_{\mathrm{UID}}$ is transitively closed, either $T^r = S^q$ or the UID $T^r \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$. In the first case, as $a_p \in \pi_{T^r}(I)$, $F$ cannot be an active fact for $\tau$, a contradiction. In the second case, we had $a_p \in \mathrm{Wants}(I, S^q)$, which is a contradiction for the same reason as before.

Hence, $I'$ is a finite weakly-sound superinstance of $I$ that satisfies $\Sigma_{\mathrm{U}}$ and with $\mathrm{dom}(I') = \mathrm{dom}(I)$, the desired claim.

C.2. Proof of the Balancing Lemma (Lemma IV.9)

Lemma IV.9 (Balancing). For any finite instance $I$, if $I$ satisfies $\Sigma_{\mathrm{UFD}}$ then it has a balanced pssinstance.

We prove the lemma without assumption binary, as we will use it without this assumption later in Section IV.

For any position $R^p$ define $o(R^p) := \mathrm{Wants}(I, R^p) \cup \pi_{R^p}(I)$. Intuitively, those are the elements that either appear at $R^p$ or want to appear there. We claim that $o(R^p) = o(S^q)$ whenever $R^p \sim_{\mathrm{ID}} S^q$. Indeed, we have $\pi_{R^p}(I) \subseteq o(S^q)$: elements in $\pi_{R^p}(I)$ want to appear at $S^q$ unless they already do, and in both cases they are in $o(S^q)$. Likewise, elements of $\mathrm{Wants}(I, R^p)$ either occur at $S^q$, or at some other position $T^r$ such that $T^r \subseteq R^p$ is a UID of $\Sigma_{\mathrm{UID}}$, so that by transitivity $T^r \subseteq S^q$ also is, and so they want to be at $S^q$ unless they already are. Hence $o(R^p) \subseteq o(S^q)$, and symmetrically $o(S^q) \subseteq o(R^p)$.

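Let $N := \max_{R^p \in \mathrm{Pos}(\sigma)} |o(R^p)|$, which is finite. We write $[R^p]_{\mathrm{ID}}$ the $\sim_{\mathrm{ID}}$-class of any position $R^p$. We define for each $\sim_{\mathrm{ID}}$-class $[R^p]_{\mathrm{ID}}$ a set $p([R^p]_{\mathrm{ID}})$ of $N - |o(R^p)|$ fresh values. We let $\mathcal{H}$ be the disjoint union of the $p([R^p]_{\mathrm{ID}})$ for all classes $[R^p]_{\mathrm{ID}}$, and set $\lambda$ to map the elements of $p([R^p]_{\mathrm{ID}})$ to $[R^p]_{\mathrm{ID}}$. We have thus defined our pssinstance $P = (I, \mathcal{H}, \lambda)$.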

Let us now show that $P$ is balanced. Consider now two positions $R^p$ and $R^q$ such that $\phi : R^p \to R^q$ and $\phi' : R^q \to R^p$ are in $\Sigma_{\mathrm{UFD}}$, and show that $|\mathrm{Wants}(P, R^p)| = |\mathrm{Wants}(P, R^q)|$. We have $|\mathrm{Wants}(P, R^p)| = |\mathrm{Wants}(I, R^p)| + |p([R^p]_{\mathrm{ID}})| = |o(R^p)| - |\pi_{R^p}(I)| + N - |o(R^p)|$, which simplifies to $N - |\pi_{R^p}(I)|$. Similarly $|\mathrm{Wants}(P, R^q)| = N - |\pi_{R^q}(I)|$. Since $I \models \Sigma_{\mathrm{UFD}}$ and $\phi$ and $\phi'$ are in $\Sigma_{\mathrm{UFD}}$ we know that $|\pi_{R^p}(I)| = |\pi_{R^q}(I)|$. From this the conclusion follows.
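The counting argument can be checked on a toy input. The sketch below assumes that the sets $\pi_{R^p}(I)$ and $\mathrm{Wants}(I, R^p)$ are already given per position, and that positions in the same $\sim_{\mathrm{ID}}$-class have the same set $o(\cdot)$ as claimed above; the names `balance` and `wants_in_pssinstance` and the helper naming scheme are hypothetical.

```python
def balance(occupied, wanted, id_class):
    """Compute N, and the helper elements with their class map (lambda).

    occupied[pos]: elements occurring at pos in I
    wanted[pos]:   elements of Wants(I, pos)
    id_class[pos]: name of the ID-class of pos
    """
    o = {pos: occupied[pos] | wanted[pos] for pos in occupied}
    n = max(len(s) for s in o.values())
    lam = {}
    done = set()
    for pos in sorted(o):
        cls = id_class[pos]
        if cls in done:
            continue  # helpers are created once per class
        done.add(cls)
        for j in range(n - len(o[pos])):
            lam[f"h_{cls}_{j}"] = cls
    return n, lam


def wants_in_pssinstance(pos, wanted, id_class, lam):
    """|Wants(P, pos)|: wanting elements of I plus helpers of the class of pos."""
    helpers = sum(1 for cls in lam.values() if cls == id_class[pos])
    return len(wanted[pos]) + helpers


occupied = {"R1": {"a"}, "R2": {"b"}, "S1": {"a", "b", "c"}}
wanted = {"R1": {"b"}, "R2": {"a"}, "S1": set()}
id_class = {"R1": "c1", "R2": "c1", "S1": "c2"}
N, lam = balance(occupied, wanted, id_class)
```

For every position of this example, `wants_in_pssinstance` returns $N$ minus the number of elements occupying the position, which is the identity used in the proof.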

C.3. Proof of the Binary Realizations Lemma (Lemma IV.10)

Lemma IV.10 (Binary realizations). For any balanced pssinstance $P$ of an instance $I$ that satisfies $\Sigma_{\mathrm{UFD}}$, we can construct a realization of $P$ that satisfies $\Sigma_{\mathrm{U}}$.

Let us construct a realization $I'$ of $P$. We construct bijections $f_R$ for every relation $R$ between $\mathrm{Wants}(P, R^1)$ and $\mathrm{Wants}(P, R^2)$ as for Proposition IV.4; this is possible, as $P$ is balanced. We then construct $I'$ in the same way, by adding to $I$, for every $R$ of $\sigma$, the fact $R(a, f_R(a))$ for every $a \in \mathrm{Wants}(P, R^1)$.

We prove that $I'$ is a realization again by observing that whenever we create a fact $R(a, f_R(a))$, then we have $a \in \mathrm{Wants}(P, R^1)$ and $f_R(a) \in \mathrm{Wants}(P, R^2)$.

The fact that $I'$ satisfies $\Sigma_{\mathrm{UFD}}$ is for the same reason as for Proposition IV.4.

We now show that $I'$ satisfies $\Sigma_{\mathrm{UID}}$. Assume to the contrary that there is an active fact $F = R(a_1, a_2)$, for a UID $\tau : R^p \subseteq S^q$, so that $a_p \in \mathrm{Wants}(I', S^q)$. If $a_p \in \mathrm{dom}(I)$, then the proof is exactly as for Proposition IV.4. Otherwise, if $a_p \in \mathcal{H}$, clearly by construction of $f_R$ and $I'$ we have $a_p \in \pi_{T^r}(I')$ iff $T^r \in \lambda(a_p)$. Hence, as $a_p \in \pi_{R^p}(I')$ and as $\tau$ witnesses by assumption reversible that $R^p \sim_{\mathrm{ID}} S^q$, we have $a_p \in \pi_{S^q}(I')$, contradicting the fact that $a_p \in \mathrm{Wants}(I', S^q)$.

C.4. Proof of Lemma "Binary realizations are completions" (Lemma IV.11)

Lemma IV.11 (Binary realizations are completions). If $I'$ is a realization of a pssinstance of $I$ then it is a weakly-sound superinstance of $I$.

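Clearly $I'$ is a superinstance of $I$. Let us show that it is weakly-sound. Recall the definition of a weakly-sound superinstance:

Definition IV.1. A superinstance $I'$ of an instance $I$ is weakly-sound if the following holds:

• for any $a \in \mathrm{dom}(I)$ and $R^p \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I')$, then either $a \in \pi_{R^p}(I)$ or $a \in \mathrm{Wants}(I, R^p)$;

• for any $a \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ and $R^p, S^q \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I')$ and $a \in \pi_{S^q}(I')$ then $R^p = S^q$ or $R^p \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$.

Consider $a \in \mathrm{dom}(I')$ and $R^p \in \mathrm{Pos}(\sigma)$ such that $a \in \pi_{R^p}(I')$. As $I'$ is a realization, we know that either $a \in \pi_{R^p}(I)$ or $a \in \mathrm{Wants}(P, R^p)$. By definition of $\mathrm{Wants}(P, R^p)$ and because $\mathcal{H} = \mathrm{dom}(I') \setminus \mathrm{dom}(I)$, this means that either $a \in \mathrm{dom}(I)$ and $a \in \pi_{R^p}(I) \cup \mathrm{Wants}(I, R^p)$, or $a \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ and $R^p \in \lambda(a)$.

Hence:

• For any $a \in \mathrm{dom}(I)$ and $R^p \in \mathrm{Pos}(\sigma)$, we have established that $a \in \pi_{R^p}(I')$ implies that either $a \in \pi_{R^p}(I)$ or $a \in \mathrm{Wants}(I, R^p)$.

• For any $a \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ and for any $R^p, S^q \in \mathrm{Pos}(\sigma)$ such that $a \in \pi_{R^p}(I')$ and $a \in \pi_{S^q}(I')$, we know that $R^p, S^q \in \lambda(a)$, which implies that $R^p \sim_{\mathrm{ID}} S^q$, so $R^p = S^q$ or $R^p \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$.

So indeed the two conditions of weak-soundness hold.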

C.5. Proof of the Realizations Lemma (Lemma IV.16)

Lemma IV.16 (Realizations). For any balanced pssinstance $P$ of an instance $I$ that satisfies $\Sigma_{\mathrm{UFD}}$, we can construct a $\Sigma_{\mathrm{U}}$-compliant piecewise realization of $P$.

Let $P = (I, \mathcal{H}, \lambda)$ be the balanced pssinstance. Recall that the $\leftrightarrow_{\mathrm{FUN}}$-classes of $\sigma$ are numbered $\Pi_1, \ldots, \Pi_n$. By definition of being balanced (Definition IV.3), for any $\leftrightarrow_{\mathrm{FUN}}$-class $\Pi_i$, for any two positions $R^p, R^q \in \Pi_i$, we have $|\mathrm{Wants}(P, R^p)| = |\mathrm{Wants}(P, R^q)|$. Hence, for all $1 \le i \le n$, let $s_i$ be the value of $|\mathrm{Wants}(P, R^p)|$ for any $R^p \in \Pi_i$. For $1 \le i \le n$, we let $m_i$ be the arity of $\Pi_i$, and number the positions of $\Pi_i$ as $R^{p^i_1}, \ldots, R^{p^i_{m_i}}$. We define for each $1 \le i \le n$ and $1 \le j \le m_i$ a bijection $\varphi^i_j$ from $\{1, \ldots, s_i\}$ to $\mathrm{Wants}(P, R^{p^i_j})$. We construct the piecewise realization $PI = (K_1, \ldots, K_n)$ by setting each $K_i$ for $1 \le i \le n$ to be $\pi_{\Pi_i}(I)$ plus the tuples $(\varphi^i_1(l), \ldots, \varphi^i_{m_i}(l))$ for $1 \le l \le s_i$.

It is clear that $PI$ is indeed a piecewise realization, because whenever we create a tuple $\mathbf{a} \in K_i$ for any $1 \le i \le n$, then, for any $R^p \in \Pi_i$, we have $a_p \in \mathrm{Wants}(P, R^p)$.

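Let us then show that $PI$ is $\Sigma_{\mathrm{UFD}}$-compliant. Assume by contradiction that there is $1 \le i \le n$ and $\mathbf{a}, \mathbf{b} \in K_i$ such that $a_l = b_l$ but $a_r \neq b_r$ for some $R^l, R^r \in \Pi_i$. As $I$ satisfies $\Sigma_{\mathrm{UFD}}$, we assume without loss of generality that $\mathbf{a} \in K_i \setminus \pi_{\Pi_i}(I)$. Now either $\mathbf{b} \in \pi_{\Pi_i}(I)$ or $\mathbf{b} \in K_i \setminus \pi_{\Pi_i}(I)$.

If $\mathbf{b} \in \pi_{\Pi_i}(I)$, then we know that $b_l \in \pi_{R^l}(I)$, but we know by construction that, as $\mathbf{a} \in K_i \setminus \pi_{\Pi_i}(I)$, we have $a_l \in \mathrm{Wants}(P, R^l)$. Now, as $a_l = b_l$ and $b_l \in \mathrm{dom}(I)$, we have $a_l \in \mathrm{dom}(I)$, so that by definition of $\mathrm{Wants}(P, R^l)$ we have $a_l \in \mathrm{Wants}(I, R^l)$, hence $a_l \notin \pi_{R^l}(I)$. Thus, as $a_l = b_l$, we have a contradiction.

Now, if $\mathbf{b} \in K_i \setminus \pi_{\Pi_i}(I)$, then, writing $R^l = R^{p^i_j}$ and $R^r = R^{p^i_{j'}}$, the fact that $a_l = b_l$ but $a_r \neq b_r$ contradicts the fact that $\varphi^i_j \circ (\varphi^i_{j'})^{-1}$ is injective. Hence, $PI$ is $\Sigma_{\mathrm{UFD}}$-compliant.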

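Let us now show that $PI$ is $\Sigma_{\mathrm{UID}}$-compliant.

We must show that, for every UID $\tau : R^p \subseteq S^q$ of $\Sigma_{\mathrm{UID}}$, we have $\mathrm{Wants}(PI, \tau) = \emptyset$, which means that we have $\pi_{R^p}(PI) \subseteq \pi_{S^q}(PI)$. Let $\Pi_i$ be the $\leftrightarrow_{\mathrm{FUN}}$-class of $R^p$, and assume to the contrary the existence of a tuple $\mathbf{a}$ of $K_i$ such that $a_p \notin \pi_{S^q}(PI)$. Either we have $a_p \in \mathrm{dom}(I)$, or we have $a_p \in \mathcal{H}$.

In the first case, as $a_p \notin \pi_{S^q}(PI)$, in particular $a_p \notin \pi_{S^q}(I)$, and as $a_p \in \pi_{R^p}(I)$, we have $a_p \in \mathrm{Wants}(I, \tau)$, so $a_p \in \mathrm{Wants}(I, S^q)$. By construction of $PI$, then, letting $\Pi_{i'}$ be the $\leftrightarrow_{\mathrm{FUN}}$-class of $S^q$ and letting $S^q = S^{p^{i'}_j}$, as $\varphi^{i'}_j$ is surjective, we must have $a_p \in \pi_{S^q}(K_{i'})$, that is, $a_p \in \pi_{S^q}(PI)$, a contradiction.

In the second case, clearly by construction we have $a_p \in \pi_{T^r}(PI)$ iff $T^r \in \lambda(a_p)$, so that, given that $\tau$ witnesses $R^p \sim_{\mathrm{ID}} S^q$, if $a_p \in \pi_{R^p}(PI)$ then $a_p \in \pi_{S^q}(PI)$, a contradiction.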

We deduce that $PI$ is indeed a $\Sigma_{\mathrm{U}}$-compliant piecewise realization of $P$, completing the proof.

C.6. Proof of the Relation-Saturated Solutions Lemma (Lemma IV.20)

Lemma IV.20 (Relation-saturated solutions). The result of performing sufficiently many chase rounds on any instance $I$ is relation-saturated.

Recall the definition of an instance being relation-saturated:

Definition IV.18. A relation $R$ is achieved (by $I$ and $\Sigma_{\mathrm{UID}}$) if there is some $R$-fact in $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$. A superinstance $I'$ of an instance $I$ is relation-saturated (for $\Sigma_{\mathrm{UID}}$) if every achieved relation (by $I$ and $\Sigma_{\mathrm{UID}}$) occurs in $I'$.

We now prove the lemma. For every relation $R$, either $R$ is not achieved by $I$ and $\Sigma_{\mathrm{UID}}$, or there is $n_R \in \mathbb{N}$ such that there is an $R$-fact of $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$ generated at the $n_R$-th round of the chase. Let $n$ be the maximum of the $n_R$ over the achieved relations $R$ of $\sigma$. As the number of relations in $\sigma$ is finite, $n$ is finite. Hence, letting $I'$ be the result of applying $n$ chase rounds to $I$, it is clear that $I'$ is relation-saturated.

C.7. Proof of Lemma "Using realizations to get completions" (Lemma IV.21)

Lemma IV.21 (Using realizations to get completions). For any finite relation-saturated instance $I$ that satisfies $\Sigma_{\mathrm{UFD}}$, from a $\Sigma_{\mathrm{U}}$-compliant piecewise realization $PI$ of a pssinstance of $I$, we can construct a finite weakly-sound superinstance of $I$ that satisfies $\Sigma_{\mathrm{U}}$.

Recall that we number $\Pi_1, \ldots, \Pi_n$ the $\leftrightarrow_{\mathrm{FUN}}$-classes of $\mathrm{Pos}(\sigma)$. We first define the following notion:

Definition C.1. We say that $\Pi_i$ is an inner $\leftrightarrow_{\mathrm{FUN}}$-class if it contains a position occurring in $\Sigma_{\mathrm{UID}}$; otherwise, it is an outer $\leftrightarrow_{\mathrm{FUN}}$-class.

Intuitively, "outer" $\leftrightarrow_{\mathrm{FUN}}$-classes are those to which no UID of $\Sigma_{\mathrm{UID}}$ can apply, so we can create fresh elements at the positions of these classes without fear that UIDs will be applicable to the fresh elements.

We will use the notion of dangerous and non-dangerous positions from Section V:

Definition V.7. We say a position $S^r \in \mathrm{Pos}(\sigma)$ is dangerous for a position $S^q \neq S^r$ if $S^r \to S^q$ is in $\Sigma_{\mathrm{UFD}}$, and write $S^r \in \mathrm{Dng}(S^q)$. Otherwise, $S^r$ is non-dangerous, written $S^r \in \mathrm{NDng}(S^q)$. Note that $\{S^q\} \cup \mathrm{Dng}(S^q) \cup \mathrm{NDng}(S^q) = \mathrm{Pos}(S)$.
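The partition stated at the end of Definition V.7 is easy to compute. The sketch below assumes that the UFDs of a relation are given as triples `(relation, r, q)` meaning that position `r` determines position `q`, with 0-based positions, and that this set is already closed under implication; the name `dangerous_positions` is hypothetical.

```python
def dangerous_positions(ufds, relation, arity, q):
    """Split the positions of `relation` other than q into Dng(q) and NDng(q)."""
    dng = {r for r in range(arity)
           if r != q and (relation, r, q) in ufds}
    ndng = set(range(arity)) - dng - {q}
    return dng, ndng


ufds = {("S", 0, 1)}  # position 0 of S determines position 1
dng, ndng = dangerous_positions(ufds, "S", 3, 1)
```

By construction the two returned sets and the position `q` itself partition the positions of the relation.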

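Observe that, if $R^p \leftrightarrow_{\mathrm{FUN}} R^q$ then for $R^r \notin \{R^p, R^q\}$, we have $R^r \in \mathrm{Dng}(R^p)$ iff $R^r \in \mathrm{Dng}(R^q)$, and likewise for $\mathrm{NDng}(R^p)$ and $\mathrm{NDng}(R^q)$. So it makes sense to define $\mathrm{Dng}(\Pi_i)$ or $\mathrm{NDng}(\Pi_i)$, for $\Pi_i$ an $\leftrightarrow_{\mathrm{FUN}}$-class of positions of some relation $R$, to refer to the positions of $\mathrm{Pos}(R) \setminus \Pi_i$ that are dangerous or non-dangerous for some $R^p \in \Pi_i$ (and hence for all of them).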

We show a first lemma about the positions where FD violations may be introduced:

Lemma C.2. For any relation $R$ and FDs $\Sigma_{\mathrm{FD}}$, for any $R^p \in \mathrm{Pos}(R)$ and UFD $R^q \to R^r$ of $\Sigma_{\mathrm{FD}}$, if $R^q \in \mathrm{NDng}(R^p)$ then $R^r \in \mathrm{NDng}(R^p)$.

Proof. Assume by contradiction that $R^r \notin \mathrm{NDng}(R^p)$. Then either $R^r = R^p$ or $R^r \in \mathrm{Dng}(R^p)$. The first case is impossible because of the UFD $R^q \to R^r$. So we have $R^r \in \mathrm{Dng}(R^p)$. Hence, the UFD $R^r \to R^p$ is in $\Sigma_{\mathrm{UFD}}$, so that by transitivity the UFD $R^q \to R^p$ is in $\Sigma_{\mathrm{UFD}}$, again contradicting the fact that $R^q \in \mathrm{NDng}(R^p)$. □

Fix the finite relation-saturated instance $I$ that satisfies $\Sigma_{\mathrm{UFD}}$, the pssinstance $P$ of $I$, and the finite $\Sigma_{\mathrm{U}}$-compliant piecewise realization $PI = (K_1, \ldots, K_n)$ of $P$. Our approach is to construct the desired superinstance $I'$ as $I \cup I_1 \cup \cdots \cup I_n$, where the facts of each $I_i$ are constructed from $K_i$, as we now explain. We call $\mathcal{F}$ the set of the fresh elements (not in $\mathrm{dom}(PI)$) that will be created in the construction, so that we will have $\mathrm{dom}(I') \subseteq \mathrm{dom}(PI) \cup \mathcal{F}$.

We consider every $1 \le i \le n$. Let $R$ be the relation to which the positions of $\Pi_i$ belong. If the relation $R$ is not achieved by $I$ and $\Sigma_{\mathrm{UID}}$, or if $\Pi_i$ is outer, then we do not create any fact for $R$, and set $I_i := \emptyset$. Otherwise, as $I$ is relation-saturated, we choose one fact $R(\mathbf{c})$ in $I$. For every $\mathbf{a} \in K_i \setminus \pi_{\Pi_i}(I)$, we create a fact $F^i_{\mathbf{a}} := R(\mathbf{b})$ in $I_i$, with $b_p$ defined as follows for every $R^p \in \mathrm{Pos}(R)$:

• If $R^p \in \Pi_i$, take $b_p := a_p$. In other words, the tuple $\mathbf{a}$ is used to fill $\mathbf{b}$ at the positions of $\Pi_i$.

• If $R^p \in \mathrm{Dng}(\Pi_i)$, use a fresh element in $\mathcal{F}$ for $b_p$. In other words, dangerous positions have to be filled with fresh elements (but this is no problem because we will show later that their classes are outer).

• If $R^p \in \mathrm{NDng}(\Pi_i)$ is non-dangerous, take $b_p := c_p$. In other words, we reuse the fact $R(\mathbf{c})$ guaranteed by $I$ being relation-saturated to complete the non-dangerous positions.
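The three cases above can be sketched as a single function that fills one fact. This is an illustration only: positions are 0-based, fresh elements are generated from a counter with an arbitrary `fresh` prefix, and the name `build_fact` and its parameters are hypothetical.

```python
import itertools

_fresh = itertools.count()


def build_fact(relation, arity, class_positions, dangerous, a, c):
    """Create R(b) from a tuple `a` over the class positions and a fact R(c).

    class_positions: ordered list of the positions of the class
    dangerous: set of positions dangerous for the class; every other position
               outside the class is treated as non-dangerous
    a: tuple giving the elements at the class positions, in the same order
    c: argument tuple of the reused fact of I
    """
    from_tuple = dict(zip(class_positions, a))
    b = []
    for p in range(arity):
        if p in from_tuple:
            b.append(from_tuple[p])            # position of the class: copy a
        elif p in dangerous:
            b.append(f"fresh{next(_fresh)}")   # dangerous: new element
        else:
            b.append(c[p])                     # non-dangerous: reuse R(c)
    return (relation, tuple(b))


reused = ("c0", "c1", "c2", "c3")
fact1 = build_fact("R", 4, [0, 1], {2}, ("x", "y"), reused)
fact2 = build_fact("R", 4, [0, 1], {2}, ("u", "v"), reused)
```

In this example the two created facts agree only on the non-dangerous position, where both reuse the element `c3`; their dangerous positions hold distinct fresh elements.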

We have thus constructed $I'$, which is clearly a finite superinstance of $I$. We first show the following claim:

Lemma C.3. For any $1 \le i \le n$ and $\mathbf{a} \in K_i$ for which we create a fact $F^i_{\mathbf{a}}$, for any $R^p \in \Pi_i$, the fact $F^i_{\mathbf{a}}$ is the only fact of $I'$ where $a_p$ occurs at position $R^p$.

This claim implies that the facts of $I$, and all the facts of the $I_i$ for $1 \le i \le n$, are pairwise distinct. By this, we mean that we did not try to recreate in $I_i$ a fact that already existed in $I$, and that we never tried to create the same fact twice in the same $I_i$ or in different $I_i$.

Proof. Fix $1 \le i \le n$ and $\mathbf{a} \in K_i$, and assume that we have created a fact $F^i_{\mathbf{a}}$; fix $R^p \in \Pi_i$.

We first show that we cannot have $a_p \in \pi_{R^p}(I)$. Assuming by contradiction that we do, let $F$ be a witnessing fact. By definition of a piecewise realization we have $\pi_{\Pi_i}(I) \subseteq K_i$, so $\pi_{\Pi_i}(F) \in K_i$. Hence, as $PI$ is $\Sigma_{\mathrm{UFD}}$-compliant, we have $\mathbf{a} = \pi_{\Pi_i}(F)$; but we do not create facts for the tuple $\mathbf{a} \in K_i$ if $\mathbf{a} \in \pi_{\Pi_i}(I)$, which contradicts the fact that we created $F^i_{\mathbf{a}}$.

Second, we show that there cannot be another fact $F$ of $I' \setminus I$ such that $a_p = \pi_{R^p}(F)$. As $PI$ is $\Sigma_{\mathrm{UFD}}$-compliant, there clearly cannot be such a fact $F^i_{\mathbf{a}'}$, for $\mathbf{a}' \in K_i$, $\mathbf{a} \neq \mathbf{a}'$, with $a_p$ occurring at position $R^p$ of $F^i_{\mathbf{a}'}$. Hence, $F$ is a fact $F^{i'}_{\mathbf{a}'}$ for $i' \neq i$. Now, $\Pi_i$ and $\Pi_{i'}$ are disjoint as $\leftrightarrow_{\mathrm{FUN}}$-classes, and thus we cannot have $R^p \in \Pi_{i'}$. So either $R^p \in \mathrm{Dng}(\Pi_{i'})$ and $a_p = \pi_{R^p}(F) \in \mathcal{F}$, or $R^p \in \mathrm{NDng}(\Pi_{i'})$ and $a_p \in \pi_{R^p}(I)$. The first case is impossible because elements of $\mathcal{F}$ occur in only one fact, and we showed above that the second case was impossible. This concludes. □

We now show that $I'$ has the required properties. Let us first show that $I'$ is weakly-sound. Recall the definition:

Definition IV.1. A superinstance $I'$ of an instance $I$ is weakly-sound if the following holds:

• for any $a \in \mathrm{dom}(I)$ and $R^p \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I')$, then either $a \in \pi_{R^p}(I)$ or $a \in \mathrm{Wants}(I, R^p)$;

• for any $a \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ and $R^p, S^q \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I')$ and $a \in \pi_{S^q}(I')$ then $R^p = S^q$ or $R^p \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$.

We begin by checking the first condition. Let $a \in \mathrm{dom}(I)$ and $R^p \in \mathrm{Pos}(\sigma)$ such that $a \in \pi_{R^p}(I')$, and let $F$ be a fact of $I'$ that witnesses it. If $F$ is a fact of $I$ then $a \in \pi_{R^p}(I)$ and $a$ does not witness a violation of weak-soundness. So assume that $F$ is a fact of $I' \setminus I$. Let $i$ be the index of the $I_i$ that contains $F$, and $\mathbf{a}$ be such that $F = F^i_{\mathbf{a}}$ (this is uniquely defined according to Lemma C.3).

We cannot have $R^p \in \mathrm{Dng}(\Pi_i)$, because we would then have $\pi_{R^p}(F) \in \mathcal{F}$, contradicting $a \in \mathrm{dom}(I)$.
We cannot have $R^p \in \mathrm{NDng}(\Pi_i)$ either, because then $a = \pi_{R^p}(F)$ would imply that $a \in \pi_{R^p}(I)$, which
we already excluded. Hence $R^p \in \Pi_i$. Now, by definition of $PI$ being a piecewise realization, as $\mathbf{a} \in K_i$,
we know that $a \in \pi_{R^p}(I)$ or $a \in \mathrm{Wants}(P, R^p)$. But we excluded $a \in \pi_{R^p}(I)$ above, and we assumed
$a \in \mathrm{dom}(I)$, so $a \in \mathrm{Wants}(P, R^p)$ translates to $a \in \mathrm{Wants}(I, R^p)$. Hence, $a$ does not witness a violation
of weak-soundness.

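We now check the second condition. Let $a \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ and $R^p, S^q \in \mathrm{Pos}(\sigma)$ be such that
$a \in \pi_{R^p}(I') \cap \pi_{S^q}(I')$. We must show that $R^p = S^q$ or $R^p \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$, that is, $R^p \sim_{\mathrm{ID}} S^q$. Now either $a \in \mathcal{F}$,
or $a \in \mathcal{H}$. If $a \in \mathcal{F}$, observe that elements of $\mathcal{F}$ occur at only one position in $I'$. Hence, necessarily
$R^p = S^q$, which implies $R^p \sim_{\mathrm{ID}} S^q$, and $a$ does not witness a violation of weak-soundness. Thus, assume $a \in \mathcal{H}$.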

Let $F$ be a fact witnessing that $a \in \pi_{R^p}(I')$, and $F'$ a fact witnessing that $a \in \pi_{S^q}(I')$. As $a \in \mathcal{H}$,
necessarily $F$ and $F'$ are facts of $I' \setminus I$, so there are $i$ and $i'$ such that $F$ and $F'$ are respectively facts of $I_i$
and $I_{i'}$. Clearly $a$ cannot occur in $F$ or $F'$ at a position of $\mathrm{Dng}(\Pi_i)$ or $\mathrm{Dng}(\Pi_{i'})$ (they contain elements
of $\mathcal{F}$) or at a position of $\mathrm{NDng}(\Pi_i)$ or $\mathrm{NDng}(\Pi_{i'})$ (they contain elements of $\mathrm{dom}(I)$). Hence, $R^p \in \Pi_i$
and $S^q \in \Pi_{i'}$. Now, as $PI$ is a piecewise realization and $a \notin \mathrm{dom}(I)$, we conclude that $a \in \mathrm{Wants}(P, R^p)$
and $a \in \mathrm{Wants}(P, S^q)$, and as $a \notin \mathrm{dom}(I)$ this implies that $R^p \in \lambda(a)$ and $S^q \in \lambda(a)$, so that $R^p \sim_{\mathrm{ID}} S^q$,
and $a$ does not witness a violation of weak-soundness.

Hence, $I'$ is weakly-sound.
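
The two conditions of Definition IV.1 can be checked mechanically on small finite instances. The following is a minimal sketch and not part of the original construction: it assumes that an instance is a set of facts `(relation, tuple)`, that a position is a pair `(relation, index)` with 0-based indices, that `uids` is a set of pairs `(source_position, target_position)`, and that $\mathrm{Wants}(I, S^q)$ is the set of elements occurring at the source of some UID into $S^q$ but not at $S^q$ itself. All function names are hypothetical.

```python
from itertools import product


def domain(instance):
    """All elements occurring in some fact of the instance."""
    return {x for _, tup in instance for x in tup}


def positions_of(instance, a):
    """Positions (relation, index) at which element a occurs."""
    return {(rel, i) for rel, tup in instance
            for i, x in enumerate(tup) if x == a}


def projection(instance, pos):
    """Elements occurring at position pos = (relation, index)."""
    rel, i = pos
    return {tup[i] for r, tup in instance if r == rel}


def wants(instance, uids, pos):
    """Assumed reading of Wants(I, pos): elements at the source of a UID
    into pos that do not yet occur at pos."""
    missing = set()
    for src, dst in uids:
        if dst == pos:
            missing |= projection(instance, src) - projection(instance, pos)
    return missing


def is_weakly_sound(sup, inst, uids):
    """Check both conditions of weak soundness for a superinstance sup of inst."""
    dom_inst = domain(inst)
    for a in domain(sup):
        occ = positions_of(sup, a)
        if a in dom_inst:
            # Old elements may only gain positions that they wanted.
            for pos in occ - positions_of(inst, a):
                if a not in wants(inst, uids, pos):
                    return False
        else:
            # New elements may only occur at positions related by a UID.
            for p, q in product(occ, repeat=2):
                if p != q and (p, q) not in uids:
                    return False
    return True
```

For instance, with the single fact `R(a, b)` and the UID from the second position of `R` to the first position of `S`, adding `S(b, f)` keeps the superinstance weakly-sound, whereas adding `S(a, f)` does not.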

Let us now show that $I' \models \Sigma_{\mathrm{UFD}}$. Assume to the contrary the existence of two facts $F$ and $F'$ that
witness a violation of a UFD $\phi : R^p \to R^q$ of $\Sigma_{\mathrm{UFD}}$. As $I \models \Sigma_{\mathrm{UFD}}$, we assume without loss of generality
that $F$ is a fact of $I' \setminus I$; let $1 \leq i \leq n$ and $\mathbf{a} \in K_i$ be such that $F = F^i_{\mathbf{a}}$. We cannot have $R^p \in \mathrm{Dng}(\Pi_i)$,
as then we would have $a_p \in \mathcal{F}$, and elements of $\mathcal{F}$ only occur in a single fact in $I'$. We cannot have
$R^p \in \Pi_i$ either because, by Lemma C.3, $F^i_{\mathbf{a}}$ is the only fact of $I'$ where $a_p$ occurs at position $R^p$. So
$R^p \in \mathrm{NDng}(\Pi_i)$, and by Lemma C.2 we have $R^q \in \mathrm{NDng}(\Pi_i)$ as well. Hence, letting $F'' = R(\mathbf{c})$ be the
fact of $I$ used to fill the positions of $\mathrm{NDng}(\Pi_i)$ in $F$, we know that $a'_p = c_p$ and $a'_q = c_q$. Thus, as this
makes it impossible that $F' = F''$, we deduce that $F''$ and $F'$ also violate $\phi$.

Now, either $F'$ is also a fact of $I$ and we have a contradiction because $F'' \in I$ but $I \models \Sigma_{\mathrm{UFD}}$, or it is a
fact of $I' \setminus I$ and, by the same process that we applied to $F$, we can replace it by a fact of $I$, reaching a
contradiction again. This proves that $I' \models \Sigma_{\mathrm{UFD}}$.
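
To make the notion of a UFD violation concrete, here is a small sketch that lists the pairs of facts violating a unary functional dependency $R^p \to R^q$, that is, two $R$-facts that agree at position $p$ but differ at position $q$. The encoding (facts as `(relation, tuple)`, 0-based indices, the function name) is an assumption made for illustration only.

```python
def ufd_violations(instance, ufds):
    """Return pairs of facts that violate some UFD.

    Each UFD is a pair ((rel, p), (rel, q)) meaning rel^p -> rel^q.
    Facts are scanned in sorted order so that the output is deterministic.
    """
    violations = []
    for (rel, p), (rel_q, q) in ufds:
        assert rel == rel_q, "a UFD relates two positions of one relation"
        first_seen = {}  # value at position p -> first fact carrying it
        for fact in sorted(instance):
            name, tup = fact
            if name != rel:
                continue
            witness = first_seen.setdefault(tup[p], fact)
            if witness[1][q] != tup[q]:
                violations.append((witness, fact))
    return violations
```

An instance satisfies the set of UFDs exactly when this list is empty.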

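Let us last show that $I' \models \Sigma_{\mathrm{UID}}$. Assume to the contrary the existence of a UID $\tau : R^p \subseteq S^q$ of $\Sigma_{\mathrm{UID}}$ and
an element $a \in \mathrm{dom}(I')$ such that $a \in \pi_{R^p}(I') \setminus \pi_{S^q}(I')$. Let $F$ be a fact of $I'$ witnessing that $a \in \pi_{R^p}(I')$.
Either $F$ is a fact of $I$ or it is a fact of $I' \setminus I$.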

For the first case, if $F$ is a fact of $I$, by definition of $PI$ being a realization, we have $a \in \pi_{R^p}(PI)$. As
$PI$ is $\Sigma_{\mathrm{UID}}$-compliant, we have $a \in \pi_{S^q}(PI)$, and letting $\mathbf{a}$ be the witnessing tuple in $K_i$, where $\Pi_i$ is the
$\leftrightarrow_{\mathrm{FUN}}$-class of $S^q$, we know that either $a \in \pi_{S^q}(I)$ or $a \in \pi_{S^q}(F^i_{\mathbf{a}})$. In the first sub-case there is nothing to
show. In the second sub-case it suffices to show that $F^i_{\mathbf{a}}$ was indeed created, and this is the case because
$\tau$ witnesses that $\Pi_i$ is inner, and $F \in I$ witnesses that $R$ is achieved in $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$, so $S$ must also
be, because of $\tau$. This concludes the first case.

For the second case, if $F$ is a fact of $I' \setminus I$, write $F = F^{i'}_{\mathbf{a}}$. The existence of $F^{i'}_{\mathbf{a}}$ implies that $\Pi_{i'}$
is inner and $R$ is achieved in $\mathrm{Chase}(I, \Sigma_{\mathrm{UID}})$; hence $S$ is, because of $\tau$. There are three possibilities:
$R^p \in \mathrm{NDng}(\Pi_{i'})$, $R^p \in \Pi_{i'}$, or $R^p \in \mathrm{Dng}(\Pi_{i'})$. The first sub-case is $R^p \in \mathrm{NDng}(\Pi_{i'})$; but then we could
have picked as witness for $a \in \pi_{R^p}(I')$ the fact $R(\mathbf{c})$ of $I$ used to define the non-dangerous positions, and
we are back to the first case. The second sub-case is $R^p \in \Pi_{i'}$; then we have $a \in \pi_{R^p}(PI)$ by construction,
so that as $PI$ is $\Sigma_{\mathrm{UID}}$-compliant we have $a \in \pi_{S^q}(PI)$, and we conclude as before. The only remaining
sub-case is the third sub-case, $R^p \in \mathrm{Dng}(\Pi_{i'})$, so that $a_p \in \mathcal{F}$. Now, as $R^p \in \mathrm{Dng}(\Pi_{i'})$, we know that
$R^p \to R^r$ is in $\Sigma_{\mathrm{UFD}}$ for any position $R^r$ of $\Pi_{i'}$ that occurs in $\Sigma_{\mathrm{UID}}$ (such an $R^r$ exists because $\Pi_{i'}$ is
inner). Now, as $\tau$ witnesses that $R^p$ occurs in $\Sigma_{\mathrm{UID}}$, we know by assumption reversible that $R^r \to R^p$ is
in $\Sigma_{\mathrm{UFD}}$, so that $R^p \in \Pi_{i'}$. But we assumed $R^p \in \mathrm{Dng}(\Pi_{i'})$, a contradiction.

Hence we conclude that $I' \models \Sigma_{\mathrm{UID}}$.

Hence, $I'$ is a finite superinstance of $I$ which is weakly-sound and satisfies $\Sigma_{\mathrm{U}}$. This concludes the
proof.

D. Proofs for Section V: k-Soundness and Reversible UIDs

This section completes the proof of the Acyclic Unary Models Theorem (Theorem III.6) under
assumption reversible.

D.1. Proof of Lemma V.2 (ACQs are preserved through k-bounded simulations)

Lemma V.2. For any instance $I$ and ACQ $q$ of size $\leq n$ such that $I \models q$, if there is an n-bounded
simulation from $I$ to $I'$, then $I' \models q$.

Fix the instance $I$. We will prove by induction on $n$ the following stronger claim: for any $n \in \mathbb{N}$, for
any ACQ $q$ of size $\leq n$ and any variable $x$ of $q$, if $q$ has a match in $I$ that maps $x$ to $a \in \mathrm{dom}(I)$, then for
any $b \in \mathrm{dom}(I')$ such that $(I, a) \leq_n (I', b)$, $q$ has a match in $I'$ mapping $x$ to $b$. The base case of $n = 0$
corresponds to queries with no atoms, and it is trivial.

For the induction step, fix $n \in \mathbb{N}$, the query $q$, the variable $x$ and the match $h$ from $q$ to $I$ that maps $x$
to $a \in \mathrm{dom}(I)$. We define a reachability relation between variables of $q$ as the reflexive and transitive
closure of the relation of co-occurring in some atom of $q$. If this relation consists of a single class,
we say that $q$ is connected. As we can otherwise rewrite $q$ as a conjunction of strictly smaller queries
of ACQ and process all such queries separately using the induction hypothesis, we assume without loss
of generality that $q$ is connected.
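
The reachability relation above is simply connectivity of variables through shared atoms, so the decomposition of a query into connected subqueries can be computed with a union-find structure. The sketch below is illustrative only: atoms are assumed to be pairs `(relation, variables)` with at least one variable each, and the function name is hypothetical.

```python
def connected_components(atoms):
    """Group the atoms of a conjunctive query by reachability of variables.

    Two atoms land in the same group when their variables are linked by a
    chain of co-occurrences in atoms.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Merge all variables that co-occur in an atom.
    for _, variables in atoms:
        for v in variables[1:]:
            parent[find(v)] = find(variables[0])

    # Collect atoms by the representative of their first variable.
    groups = {}
    for atom in atoms:
        groups.setdefault(find(atom[1][0]), []).append(atom)
    return list(groups.values())
```

A query is connected in the sense used by the proof when this function returns a single group.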

Let $\mathcal{A} = A_1, \ldots, A_m$ be the atoms of $q$ where $x$ occurs (this set of atoms is non-empty, by the
connectedness assumption). Because $q$ is an ACQ, each variable $y$ occurring in one of the $A_i$ occurs at most
once: once per atom (as the same variable cannot occur multiple times in an atom), and in only one
atom (as if $y$ occurs both in $A_{i_1}$ and $A_{i_2}$ then $A_{i_1}, y, A_{i_2}, x$ is a Berge cycle of $q$). Let $Y$ be the set of the
variables occurring in the $A_i$ (not including $x$).

Because $q$ is acyclic and connected, the other variables of $q$ can be partitioned depending on the
variable in $Y$ from which they are reachable without using $\mathcal{A}$. Hence, we can partition the remaining
atoms of $q$ into strictly smaller acyclic subqueries $q_1(y_1, \mathbf{z}_1), \ldots, q_l(y_l, \mathbf{z}_l)$ in ACQ, for $Y = \{y_1, \ldots, y_l\}$,
where the $\mathbf{z}_j$ are pairwise disjoint sets of variables.

Now, let $b \in \mathrm{dom}(I')$ be such that $(I, a) \leq_n (I', b)$. For each atom $A_i = R(\mathbf{x})$ in $\mathcal{A}$, let $1 \leq p_i \leq |R|$
be the one position such that $x_{p_i} = x$. Consider the fact $F_i = R(\mathbf{a}_i)$ that is the image of $A_i$ in $I$ by $h$.

As $(I, a) \leq_n (I', b)$, there exists a fact $F'_i = R(\mathbf{b})$ of $I'$ with $b_{p_i} = b$ and with $(I, a_q) \leq_{n-1} (I', b_q)$ for
all $1 \leq q \leq |R|$ (writing $a_q$ for the $q$-th component of $\mathbf{a}_i$). Consider now each variable $y_j \in Y$ that occurs in $A_i$, letting $1 \leq q \leq |R|$ be the
one position such that $x_q = y_j$, and let $q_j(y_j, \mathbf{z}_j)$ be the subquery corresponding to $y_j$. We know that
$(I, a_q) \leq_{n-1} (I', b_q)$, and that $q_j$ has a match in $I$ that maps $y_j$ to $a_q$ (namely, the restriction $h_j$ of the
match $h$ to the subquery $q_j$) so that, by the induction hypothesis, $q_j$ has a match $h'_j$ in $I'$ where $y_j$ is
matched to $b_q$. Now, we can assemble the $F'_i$ and all the matches $h'_j$ thus obtained, because the $\mathbf{z}_j$ are
pairwise disjoint, yielding a match $h'$ of $q$ in $I'$ where $x$ is matched to $b$. This concludes the induction
step.

Hence, the stronger claim is proven by induction. It remains to observe that it implies the desired
claim. Indeed, if $I \models q$ and there is an n-bounded simulation sim from $I$ to $I'$, choose any variable $x$ in
$q$ (if $q$ has no variables, the result is vacuous), consider any match of $q$ in $I$ matching $x$ to $a$, use sim to
define $b := \mathrm{sim}(a)$, and deduce the existence of a match of $q$ in $I'$ (matching $x$ to $b$) using the claim that
we have shown by induction.
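
The relation $(I, a) \leq_n (I', b)$ is used in this proof through one unfolding step: every fact of $I$ containing $a$ at some position must be mirrored by a fact of $I'$ of the same relation containing $b$ at that position, with the other components related at level $n - 1$. The sketch below implements exactly this reading, which is an assumption taken from how the proof uses the relation rather than a restatement of the formal definition in the main text. Facts are `(relation, tuple)` pairs and the function names are hypothetical.

```python
from functools import lru_cache


def make_simulates(inst, inst2):
    """Return a memoised function simulates(a, b, n) for (inst, a) <=_n (inst2, b)."""
    facts1 = tuple(inst)
    facts2 = tuple(inst2)

    @lru_cache(maxsize=None)
    def simulates(a, b, n):
        if n == 0:
            return True  # level 0 imposes no requirement
        for rel, tup in facts1:
            for p, x in enumerate(tup):
                if x != a:
                    continue
                # Look for a fact of the same relation with b at position p
                # whose components simulate those of tup at level n - 1.
                if not any(
                    rel2 == rel and tup2[p] == b and
                    all(simulates(aq, bq, n - 1) for aq, bq in zip(tup, tup2))
                    for rel2, tup2 in facts2
                ):
                    return False
        return True

    return simulates
```

For example, a single edge `E(a, b)` is simulated at every level by a self-loop `E(c, c)`, but the loop is not simulated by the edge beyond level 0.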

D.2. Proof of Lemma V.5 (AFactCl is finite)

Lemma V.5. For any initial instance $I_0$, set $\Sigma_{\mathrm{UID}}$ of UIDs, and $k \in \mathbb{N}$, AFactCl is finite.

We first show that $\simeq_k$ has only a finite number of equivalence classes on $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$. Indeed, for
any element $a \in \mathrm{dom}(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}))$, by the Unique Witness Property, the number of facts in which $a$
occurs is bounded by a constant depending only on $I_0$ and $\Sigma_{\mathrm{UID}}$. Hence, there is a constant $M$ depending
only on $I_0$, $\Sigma_{\mathrm{UID}}$, and $k$, so that, for any element $a \in \mathrm{dom}(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}))$, the number of elements
of $\mathrm{dom}(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}))$ which are relevant to determine the $\simeq_k$-class of $a$ (that is, the elements whose
distance to $a$ in the Gaifman graph of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ is $\leq k$) is bounded by $M$.
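
The set of elements within distance $k$ of a given element in the Gaifman graph can be computed by a breadth-first search, where two elements are adjacent when they co-occur in a fact. The following sketch works on a finite instance (for the chase, one would apply it to a finite prefix); the encoding of facts as `(relation, tuple)` and the function name are assumptions for illustration.

```python
from collections import deque


def gaifman_ball(instance, a, k):
    """Elements at Gaifman distance at most k from a."""
    # Build adjacency: elements are neighbours if they share a fact.
    adjacent = {}
    for _, tup in instance:
        for x in tup:
            adjacent.setdefault(x, set()).update(tup)

    distance = {a: 0}
    queue = deque([a])
    while queue:
        x = queue.popleft()
        if distance[x] == k:
            continue  # do not explore beyond radius k
        for y in adjacent.get(x, ()):
            if y not in distance:
                distance[y] = distance[x] + 1
                queue.append(y)
    return set(distance)
```

The finiteness argument above amounts to saying that the size of this ball is bounded independently of the chosen element.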

This clearly implies that AFactCl is finite, because the number of m-tuples of equivalence classes
of $\simeq_k$ that occur in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ is then finite for any $m \leq \max_{R \in \sigma} |R|$, and $\mathrm{Pos}(\sigma)$ is finite.

D.3. Proof of the Fact-Saturated Solutions Lemma (Lemma V.6) 

Lemma V.6 (Fact-saturated solutions). The result $I$ of performing sufficiently many chase rounds on $I_0$
is such that $J_0 = (I, \mathrm{id})$ is a fact-saturated aligned superinstance of $I_0$.

For every $D \in$ AFactCl, let $n_D \in \mathbb{N}$ be such that $D$ is achieved by a fact of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ created
at round $n_D$. As AFactCl is finite, $n := \max_{D \in \mathrm{AFactCl}} n_D$ is finite. Hence, all classes of AFactCl are
achieved after $n$ chase rounds on $I_0$.

Consider now $I'_0$ obtained from the aligned superinstance $I_0$ by $n$ rounds of the UID chase, and
$J_0 = (I'_0, \mathrm{id})$. It is clear that for any $D \in$ AFactCl, there is an achiever $F = R(\mathbf{b})$ of $D$ in $I'_0$. Hence,
the corresponding fact in $J_0$ is an achiever of $D$ in $J_0$.

D.4. Proof of the Fact-Thrifty Chase Steps Lemma (Lemma V.9) 

We first prove the following lemma, which we will use to justify that we can extend aligned instances. 

Lemma D.1. Let $n \in \mathbb{N}$. Let $I_1$ and $I$ be instances and sim be an n-bounded simulation from $I_1$ to $I$.
Let $I_2$ be a superinstance of $I_1$ defined by adding one fact $F_n = R(\mathbf{a})$ to $I_1$, and let sim' be a mapping
from $I_2$ to $I$ such that $\mathrm{sim}'|_{I_1} = \mathrm{sim}$. Assume there is a fact $F_w = R(\mathbf{b})$ in $I$ such that, for all $R^i \in \mathrm{Pos}(R)$,
$\mathrm{sim}'(a_i) \simeq_n b_i$. Then sim' is an n-bounded simulation from $I_2$ to $I$.

Proof. We prove the claim by induction on n. The base case of n = 0 is immediate. 

Let $n > 0$, assume that the claim holds for $n - 1$, and show that it holds for $n$. As sim is an n-bounded
simulation, it is an $(n-1)$-bounded simulation, so we know by the induction hypothesis that sim' is an
$(n-1)$-bounded simulation.





Let us now show that it is an n-bounded simulation. Let $a \in \mathrm{dom}(I_2)$ be an element and show that
$(I_2, a) \leq_n (I, \mathrm{sim}'(a))$. To do this, choose $F = S(\mathbf{a})$ a fact of $I_2$ with $a_p = a$ for some $p$, and show that
there exists a fact $F' = S(\mathbf{a}')$ of $I$ with $a'_p = \mathrm{sim}'(a_p)$ and $(I_2, a_q) \leq_{n-1} (I, a'_q)$ for all $S^q \in \mathrm{Pos}(S)$.

The first possibility is that $F$ is the new fact $F_n = R(\mathbf{a})$. In this case, as we have $(I, b_p) \leq_n (I, \mathrm{sim}'(a_p))$,
considering $F_w$, we deduce the existence of a fact $F'_w = R(\mathbf{c})$ in $I$ such that $c_p = \mathrm{sim}'(a_p)$ and $(I, b_q) \leq_{n-1}
(I, c_q)$ for all $1 \leq q \leq |R|$. We take $F' = F'_w$. By construction we have $c_p = \mathrm{sim}'(a_p)$. Fixing $1 \leq q \leq |R|$,
to show that $(I_2, a_q) \leq_{n-1} (I, c_q)$, we use the fact that sim' is an $(n-1)$-bounded simulation to deduce
that $(I_2, a_q) \leq_{n-1} (I, \mathrm{sim}'(a_q))$. Now, we have $(I, \mathrm{sim}'(a_q)) \leq_{n-1} (I, b_q)$, and as we explained we have
$(I, b_q) \leq_{n-1} (I, c_q)$, so we conclude by transitivity.

If $F$ is another fact, then it is a fact of $I_1$, so its elements are in $\mathrm{dom}(I_1)$, and as sim' coincides with
sim on such elements, we conclude because sim is an n-bounded simulation. □

We then prove the main result: 

Lemma V.9 (Fact-thrifty chase steps). For any fact-saturated aligned superinstance J, the result J' of 
a fact-thrifty chase step on J is indeed a well-defined aligned superinstance where the former active 
fact Fa is no longer active. 

We first observe that fact-thrifty chase steps are well-defined because a suitable $F_r = S(\mathbf{c})$ always
exists, as $J$ is fact-saturated. It is immediate that $I'$ is finite.

It is immediate that, letting $J' = (I', \mathrm{sim}')$ be the result of the process, $I'$ is still a superinstance
of $I_0$, and the previously active fact $F_a$ is no longer active in $I'$. To show that sim' is still a k-bounded
simulation, use Lemma D.1 with $F_n = S(\mathbf{b})$ and $F_w = S(\mathbf{b}')$. The fact that sim' is the identity on $I_0$ is
immediate because $\mathrm{sim}'|_{I_0} = \mathrm{sim}|_{I_0}$.

We now show that $I'$ satisfies $\Sigma_{\mathrm{UFD}}$, using the fact that $I$ does. Indeed, any violation of $\Sigma_{\mathrm{UFD}}$ in
$I'$ would have to include the one new fact $F_n = S(\mathbf{b})$. By way of contradiction, let $\phi : S^l \to S^r$ be a
violated UFD in $\Sigma_{\mathrm{UFD}}$ and let $\{F, F_n\}$ be a violation, where $F = S(\mathbf{d})$ is some fact of $I$. It is clear that
we cannot have $d_q = b_q$, as otherwise this would contradict the fact that $F_a$ was an active fact. Hence,
by construction of the new fact $F_n$, we can only have $b_t = d_t$ if $S^t \in \mathrm{NDng}(S^q)$. As $\{F, F_n\}$ violates $\phi$,
this implies that $S^l \in \mathrm{NDng}(S^q)$, so that, by Lemma C.2, $S^r \in \mathrm{NDng}(S^q)$. Now, observe that we have
$\pi_{\mathrm{NDng}(S^q)}(F_n) = \pi_{\mathrm{NDng}(S^q)}(F_r)$, with $F_r$ the fact used to fill the non-dangerous positions in the definition
of fact-thrifty chase steps. Now, we cannot have $F = F_r$ because they must disagree on $S^r$, so that
$\{F, F_r\}$ also witnesses a violation of $\phi$ in $I$. This contradicts our assumption that $I \models \Sigma_{\mathrm{UFD}}$.

We must now check the last part of the definition of aligned superinstances, which only needs to be
verified for the fresh elements: for $S^r \neq S^q$, if $b_r$ is fresh, then it occurs in $I'$ at the position where $\mathrm{sim}'(b_r)$
was introduced in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$. For this, it suffices to show that $b'_q$ was the exported element of $F_w$.
In this case, as $\mathrm{sim}'(b_r) = b'_r$, we will know that $b'_r$ was introduced at position $S^r$ in $F_w$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$,
so the condition is respected. We make this a separate lemma:

Lemma D.2. Let $J$ be an aligned superinstance of $I_0$ and consider the application of a thrifty chase
step for a UID $\tau : R^p \subseteq S^q$. Consider the chase witness $F_w = S(\mathbf{b}')$. Then $b'_q$ is the exported element
of $F_w$.

Using this lemma, it is also clear that $\mathrm{sim}'|_{I' \setminus I_0}$ maps to $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}) \setminus I_0$, which is the last thing we
had to verify. Indeed, for all fresh elements $b_r \in \mathrm{dom}(I') \setminus \mathrm{dom}(I)$ (with $S^r \neq S^q$), which are clearly
not in $I_0$, we have fixed $\mathrm{sim}'(b_r)$ to be $b'_r$, which by the lemma is introduced in $F_w$, so it cannot be an
element of $I_0$; hence it is indeed an element of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}) \setminus I_0$.

We conclude by proving Lemma D.2:

Proof. Let $F_a = R(\mathbf{a})$ be the active fact in $J$, $F_n = S(\mathbf{b})$ be the new fact of $J'$, and $\tau : R^p \subseteq S^q$ be the UID,
so $a_p = b_q$ is the exported element of this chase step. Assume by way of contradiction that $b'_q$ was not
the exported element in $F_w$, so that it was introduced in $F_w$. In this case, as $\mathrm{sim}(a_p) = \mathrm{sim}'(b_q) = b'_q$,
by the last part of the definition of aligned superinstances, we have $a_p \in \pi_{S^q}(I)$, which contradicts the
fact that $a_p \in \mathrm{Wants}(J, \tau)$. Hence, we have proved by contradiction that $b'_q$ was the exported element
in $F_w$. □

D.5. Proof of the Fact-Thrifty Completion Proposition (Proposition V.10)

Proposition V.10 (Fact-thrifty completion). Under assumption reversible, for any fact-saturated aligned
superinstance $J$ of $I_0$, we can expand $J$ by fact-thrifty chase steps to a fact-saturated aligned
superinstance $J'$ of $I_0$ that satisfies $\Sigma_{\mathrm{UID}}$.

There are two steps to the proof. The first one is to apply initial chasing by fresh fact-thrifty chase
steps to ensure a certain property, k-reversibility. The second one is to use fact-thrifty chase steps to
satisfy $\Sigma_{\mathrm{UID}}$, using the constructions of Section IV.

We start with the first step. We consider a forest structure on the facts of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$: the facts
of $I_0$ are the roots, and the parent of a fact $F$ not in $I_0$ is the fact $F'$ that was the active fact for which $F$
was created, so that $F'$ and $F$ share the exported element of $F$. For $a \in \mathrm{dom}(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}))$, if $a$ was
introduced at position $S^r$ of an $S$-fact $F = S(\mathbf{a})$ created by applying the UID $\tau : R^p \subseteq S^q$ (with $S^q \neq S^r$)
to its parent fact $F'$, we call $\tau$ the last UID of $a$. The last two UIDs of $a$ are $(\tau, \tau')$ where $\tau'$ is the last
UID of the exported element $a_q$ of $F$ (which was introduced in $F'$). For $n \in \mathbb{N}$, we define the last n
UIDs in the same way, for elements of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ introduced after sufficiently many rounds. We
say that $a$ is n-reversible if its last n UIDs are reversible.

We accordingly define the notion of n-reversible aligned superinstance, which requires that elements
where a UID is violated are mapped by sim to an n-reversible element in the chase. Recall that, for any
position $R^p$, we write $[R^p]_{\mathrm{ID}}$ for the $\sim_{\mathrm{ID}}$-class of $R^p$.

Definition D.3. An aligned superinstance $J$ of $I_0$ is n-reversible if for any position $S^q$ and $a \in \mathrm{Wants}(J, S^q)$,
$\mathrm{sim}(a)$ is an n-reversible element of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ introduced at a position of $[S^q]_{\mathrm{ID}}$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$.

The first step of the proof of Proposition V.10 is to perform $k + 1$ fresh fact-thrifty chase rounds on
the input fact-saturated aligned superinstance $J$, to ensure that the result $J'$ is k-reversible for $\Sigma_{\mathrm{UID}}$:

Proposition D.4 (Ensuring n-reversibility). For any $n \in \mathbb{N}$, applying $n + 1$ fresh fact-thrifty chase
rounds on a fact-saturated aligned superinstance $J$ by the UIDs of $\Sigma_{\mathrm{UID}}$ yields a fact-saturated aligned
superinstance $J'$ that is n-reversible for $\Sigma_{\mathrm{UID}}$.

This proposition is proved in Appendix D.6. 

The seeond step of the proof is simply to apply the following lemma to J'. 

Lemma D.5 (Guided chase). For any fact-saturated k-reversible aligned superinstance $J = (I, \mathrm{sim})$
of $I_0$, we can build by fact-thrifty chase steps an aligned superinstance $J' = (I', \mathrm{sim}')$ of $I_0$ such that
$I \subseteq I'$, $\mathrm{sim}'|_I = \mathrm{sim}$, and $J'$ satisfies $\Sigma_{\mathrm{UID}}$.

The lemma is proved in Appendix D.7. It uses the constructions of Section IV, and relies on an
independent result about the UID chase, the Chase Locality Theorem (Theorem V.11), proved in
Appendix D.8. Clearly, applying the Guided Chase Lemma to $J'$ concludes the proof of the Fact-Thrifty
Completion Proposition.

D.6. Proof of Proposition “Ensuring n-reversibility” (Proposition D.4)

We first make the following easy observation: 

Lemma D.6. Let $J$ be an aligned superinstance, and $J'$ be the result of applying one chase round
to $J$ with fresh fact-thrifty chase steps. Let $a \in \mathrm{Wants}(J', \tau)$ for any UID $\tau$. Then we have $a \in
\mathrm{dom}(J') \setminus \mathrm{dom}(J)$, and $a$ occurs in a single fact $F$ (which is an active fact for $\tau$).

Proof. For the first part of the claim, let us assume by way of contradiction that $a \in \mathrm{dom}(J)$. Note
that, by definition of chase rounds, we cannot have $a \in \mathrm{Wants}(J, \tau)$, otherwise we could not have
$a \in \mathrm{Wants}(J', \tau)$. Hence, if we have $a \in \mathrm{dom}(J)$ but $a \notin \mathrm{Wants}(J, \tau)$, any active fact $F$ witnessing
$a \in \mathrm{Wants}(J', \tau)$ must be in $J'$.

Now, by definition of fact-thrifty chase steps, as $a \in \mathrm{dom}(J)$, there are two possibilities. Either $a$
was the exported element in $F$, or it was an element reused at a non-dangerous position. The first case
is impossible: because $\Sigma_{\mathrm{UID}}$ is transitively closed, the new facts created in $J'$ cannot make new UIDs
applicable to old elements of $J$. The second case is also impossible: elements reused at non-dangerous
positions already occurred at the same position in $J$, so this cannot make new UIDs applicable to them.
This proves the first claim.

The second part of the claim is by observing that elements created in $J'$ occur in a single fact, by
definition of chase rounds, and by definition of fresh fact-thrifty chase steps (elements in new facts are
either in $\mathrm{dom}(J)$ or are fresh). So the one fact where $a$ occurs must be the active fact witnessing that
$a \in \mathrm{Wants}(J', \tau)$. □

We then show the following simple lemma about n-reversibility: 

Lemma D.7. Let $n \in \mathbb{N}$, let $J$ be an n-reversible aligned superinstance of $I_0$ and let $F_n = S(\mathbf{b})$ be a new
fact obtained by applying a thrifty chase step to $J$. For all $S^r \in \mathrm{Pos}(S)$ such that $b_r \notin \mathrm{dom}(J)$, $\mathrm{sim}'(b_r)$
is $(n+1)$-reversible and introduced at position $S^r$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$.

Proof. Let $F_a$ be the active fact, $F_w = S(\mathbf{b}')$ be the chase witness, and $\tau : R^p \subseteq S^q$ be the UID for this chase step.
By Lemma D.2 we know that $b'_q$ is the exported element of $F_w$. Hence, for all $S^r \in \mathrm{Pos}(S) \setminus \{S^q\}$, $b'_r$ is
$(n+1)$-reversible and introduced at position $S^r$. Now, for all $S^r \in \mathrm{Pos}(S)$ such that $b_r$ is fresh in $F_n$, we
have set $\mathrm{sim}'(b_r) = b'_r$, so the result follows. □

We now prove the main result: 

Proposition D.4 (Ensuring n-reversibility). For any $n \in \mathbb{N}$, applying $n + 1$ fresh fact-thrifty chase
rounds on a fact-saturated aligned superinstance $J$ by the UIDs of $\Sigma_{\mathrm{UID}}$ yields a fact-saturated aligned
superinstance $J'$ that is n-reversible for $\Sigma_{\mathrm{UID}}$.

Fix the aligned superinstance $J = (I, \mathrm{sim})$. We prove the result by induction on $n$. For the base
case $n = 0$, letting $J'$ be the result of applying one chase round to $J$, we need only show that for any
position $S^q$ and $a \in \mathrm{Wants}(J', S^q)$, $\mathrm{sim}'(a)$ was introduced at a position of $[S^q]_{\mathrm{ID}}$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$.
By Lemma D.6, $a$ occurs in a single fact $F$ at some position $R^p$ (so that, using assumption reversible,
$R^p \sim_{\mathrm{ID}} S^q$), and we have $a \in \mathrm{dom}(J') \setminus \mathrm{dom}(J)$, so it was created by the application of a thrifty chase
step to $J$. By Lemma D.7, we conclude that $\mathrm{sim}'(a)$ was introduced at position $R^p$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$,
which implies the desired claim.

For the induction, fix $n > 0$ and assume that the result is true for $n - 1$. Let $J' = (I', \mathrm{sim}')$ be the result
of applying $(n - 1) + 1$ chase rounds to $J$. By induction hypothesis, $J'$ is $(n-1)$-reversible. We want to
show that $J'' = (I'', \mathrm{sim}'')$ obtained by applying one more chase round to $J'$ is n-reversible. This is shown
exactly as in the base case, except that, when applying Lemma D.7, we use the $(n-1)$-reversibility of $J'$
to deduce the n-reversibility of the element under consideration.

This proves the desired claim by induction. Note that we have relied implicitly on the Fact-Thrifty
Chase Steps Lemma (Lemma V.9) to justify that the result of chase rounds by fact-thrifty chase steps
is indeed an aligned superinstance; it is immediate that fact-saturation is preserved.

D.7. Proof of the Guided Chase Lemma (Lemma D.5) 

Recall that the $\leftrightarrow_{\mathrm{FUN}}$-classes of $\mathrm{Pos}(\sigma)$ are numbered $\Pi_1, \ldots, \Pi_n$. Recall the notion of inner and outer
$\leftrightarrow_{\mathrm{FUN}}$-classes (Definition C.1), and the notion of piecewise realization (Definition IV.14). We define:

Definition D.8. A superinstance $I'$ of the instance $I$ follows the piecewise realization $PI = (K_1, \ldots, K_n)$
if for every inner $\leftrightarrow_{\mathrm{FUN}}$-class $\Pi_i$, we have $\pi_{\Pi_i}(I') \subseteq K_i$.

We show the main claim:

Lemma D.5 (Guided chase). For any fact-saturated k-reversible aligned superinstance $J = (I, \mathrm{sim})$
of $I_0$, we can build by fact-thrifty chase steps an aligned superinstance $J' = (I', \mathrm{sim}')$ of $I_0$ such that
$I \subseteq I'$, $\mathrm{sim}'|_I = \mathrm{sim}$, and $J'$ satisfies $\Sigma_{\mathrm{UID}}$.

Fix the fact-saturated k-reversible aligned superinstance $J = (I, \mathrm{sim})$ of $I_0$. Let $P$ be a
balanced pssinstance of $J$ obtained by the Balancing Lemma (Lemma IV.9) and let $PI = (K_1, \ldots, K_n)$ be
a finite $\Sigma_{\mathrm{U}}$-compliant piecewise realization of $P$ obtained by the Realizations Lemma (Lemma IV.16).

We will prove the result by satisfying UID violations in $J$ with fact-thrifty chase steps using the
piecewise realization $PI$, yielding a finite aligned superinstance $J_f = (I_f, \mathrm{sim}_f)$ such that $I \subseteq I_f$, the
restriction of $\mathrm{sim}_f$ to $I$ is sim, $J_f$ satisfies $\Sigma_{\mathrm{UID}}$, and $I_f$ follows $PI$. The process is a variant of Lemma
“Using realizations to get completions” (Lemma IV.21).

We call $J' = (I', \mathrm{sim}')$ the current state of our superinstance, starting at $J' := J$. We will perform
fact-thrifty chase steps on $J'$. We call $\mathcal{F}$ the set of all fresh elements (not in $\mathrm{dom}(P)$) that we will introduce
(only in outer classes) during the chase steps. It is immediate that our construction will maintain the
following:

fsat: $J'$ is a fact-saturated aligned superinstance of $I_0$ (this uses Lemma V.9);

sub: $I \subseteq I'$;

sim: $\mathrm{sim}'|_{\mathrm{dom}(I)} = \mathrm{sim}$.

Further, we will additionally maintain the following invariants:

fw: $I'$ follows $PI$;

krev: $J'$ is k-reversible;

out: elements of outer classes are only in $\mathcal{F}$ or in $\mathrm{dom}(I)$.

We now describe formally how we apply each fact-thrifty chase step. Choose an element $a \in
\mathrm{Wants}(J', \tau)$ to which some UID $\tau : R^p \subseteq S^q$ is applicable. Let $F_a = R(\mathbf{a})$ be the active fact, with
$a = a_p$. The UID $\tau$ witnesses that the $\leftrightarrow_{\mathrm{FUN}}$-classes $\Pi_i$ and $\Pi_{i'}$, of $R^p$ and $S^q$ respectively, are inner, so
by invariant fw we have $a \in \pi_{R^p}(PI)$. As $PI$ is $\Sigma_{\mathrm{UID}}$-compliant, we must have $a \in \pi_{S^q}(PI)$, and there is
a $|\Pi_{i'}|$-tuple $\mathbf{t} \in K_{i'}$ such that $t_q = a$.

We choose a fact $F_r = S(\mathbf{c})$ of $J'$ that achieves the fact class of the chase witness $F_w$ (this is possible
by invariant fsat), and create a new fact $F_n = S(\mathbf{b})$ with the fact-thrifty chase step defined as follows:

• For the exported position $S^q$, we set $b_q := a_p$.

• For any $S^r \in \Pi_{i'}$, noting that necessarily $S^r \in \mathrm{Dng}(S^q)$, we set $b_r := t_r$.

• For any position $S^r \in \mathrm{Dng}(S^q) \setminus \Pi_{i'}$, we take $b_r$ to be a fresh element from $\mathcal{F}$.

• For any position $S^r \in \mathrm{NDng}(S^q)$, we set $b_r := c_r$.
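
The four rules above determine the new fact position by position. The sketch below assembles such a tuple under simplifying assumptions that are not part of the original text: positions are 0-based indices of the relation, the tuple from the piecewise realization is given as a dictionary from the positions of the class to elements, every position that is neither exported, nor in the class, nor non-dangerous is treated as dangerous, and fresh elements come from an iterator. All names are hypothetical.

```python
from itertools import count


def build_thrifty_fact(rel, arity, q, exported, class_positions, t,
                       ndng_positions, c, fresh):
    """Assemble the tuple of the new fact following the four rules.

    q               -- exported position
    exported        -- exported element (a_p in the text)
    class_positions -- positions of the FUN-class of the exported position
    t               -- dict: position in class_positions -> element
    ndng_positions  -- non-dangerous positions
    c               -- tuple of the achiever fact reused at those positions
    fresh           -- iterator producing fresh elements
    """
    b = [None] * arity
    for r in range(arity):
        if r == q:
            b[r] = exported          # exported position
        elif r in class_positions:
            b[r] = t[r]              # follow the piecewise realization
        elif r in ndng_positions:
            b[r] = c[r]              # reuse the achiever fact
        else:
            b[r] = next(fresh)       # remaining dangerous positions
    return (rel, tuple(b))
```

A possible call is `build_thrifty_fact("S", 4, 0, "a", {0, 1}, {0: "a", 1: "h"}, {3}, ("x", "y", "z", "w"), (f"f{i}" for i in count()))`, which fills the third position with the first fresh element.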

We must verify that this satisfies the conditions of thrifty chase steps. The fact that $b_r \in \pi_{S^r}(J')$ for
$S^r \in \mathrm{NDng}(S^q)$ is immediate by definition of $F_r$. We now show the two other points.

First, we show that $b_r \notin \pi_{S^r}(J')$ for $S^r \in \mathrm{Dng}(S^q)$. Obviously this needs only to be checked for
$S^r \in \Pi_{i'}$ (as the other $b_r$ are always fresh). Assume to the contrary that $t_r \in \pi_{S^r}(J')$, and let $F = S(\mathbf{d})$ be
a witnessing fact. As $\Pi_{i'}$ is inner, by invariant fw, we deduce that $\pi_{\Pi_{i'}}(\mathbf{d}) \in \pi_{\Pi_{i'}}(PI)$. Now, as $d_r = t_r$
and $PI$ is $\Sigma_{\mathrm{UFD}}$-compliant, we deduce that $\pi_{\Pi_{i'}}(\mathbf{d}) = \mathbf{t}$, so that $F$ witnesses that $d_q$ is in $\pi_{S^q}(J')$. As we have
$d_q = t_q = a$, this contradicts the applicability of $\tau$ to $a$. Hence, the claim is proven.

Second, we check that reused elements have the right sim-image. This is the case by definition of
fact-thrifty chase steps for the non-dangerous positions, so again we need only check this for elements
at a position $S^r \in \Pi_{i'}$, and only if they are not fresh. We start by showing that, for such $S^r$, we have
$b_r \in \mathrm{Wants}(J', S^r)$.

Indeed, we have $b_r = t_r$ which is in $\pi_{S^r}(PI)$, and we cannot have $\mathbf{t} \in \pi_{\Pi_{i'}}(J')$, as otherwise this would
contradict the applicability of $\tau$ to $a$; so in particular, by invariant sub, we cannot have $\mathbf{t} \in \pi_{\Pi_{i'}}(I)$.
Thus, by definition of a piecewise realization, we have $t_r \in \mathrm{Wants}(P, S^r)$. Recalling that we have
$t_r \in \mathrm{dom}(J')$, we show that this implies $t_r \in \mathrm{Wants}(J', S^r)$. Recalling the definition of $t_r \in \mathrm{Wants}(P, S^r)$,
we distinguish two subcases: (1.) $t_r \in \mathrm{dom}(I)$ and $t_r \in \mathrm{Wants}(I, S^r)$, or (2.) $t_r \in \mathcal{H}$ and $S^r \in \lambda(t_r)$.

In the subcase (1.) $t_r \in \mathrm{dom}(I)$ and $t_r \in \mathrm{Wants}(I, S^r)$, we remember that in the first point we showed
that $t_r \notin \pi_{S^r}(J')$. So we still have $t_r \in \mathrm{Wants}(J', S^r)$, which is what we claimed.

In the subcase (2.) $t_r \in \mathcal{H}$ and $S^r \in \lambda(t_r)$, consider a fact $F'$ of $J'$ witnessing $t_r \in \mathrm{dom}(J')$, where $t_r$
occurs at a position $T^l$; let $\Pi_{i''}$ be the $\leftrightarrow_{\mathrm{FUN}}$-class of $T^l$. As $t_r \in \mathcal{H}$, by invariant out, $\Pi_{i''}$ is inner, so
by invariant fw there is a tuple $\mathbf{t}'$ of $K_{i''}$ such that $t'_l = t_r$. Now, as $t_r \in \mathcal{H}$, by definition of piecewise
realizations, we have $T^l \in \lambda(t_r)$. Hence, either the UID $\tau' : T^l \subseteq S^r$ is in $\Sigma_{\mathrm{UID}}$ or we have $T^l = S^r$. As
$t_r \in \pi_{T^l}(J')$ and we have shown in the first point that $t_r \notin \pi_{S^r}(J')$, we know that $T^l \neq S^r$, so $\tau'$ is in $\Sigma_{\mathrm{UID}}$.
Hence, as $F'$ witnesses that $t_r \in \pi_{T^l}(J')$, and as $t_r \notin \pi_{S^r}(J')$, we have $t_r \in \mathrm{Wants}(J', S^r)$, as we claimed.

Hence, we know that $b_r = t_r$ is in $\mathrm{Wants}(J', S^r)$ in either subcase. By invariant krev, this implies that
$\mathrm{sim}'(b_r)$ is a k-reversible element of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ introduced at a position of $[S^r]_{\mathrm{ID}}$. By Lemma D.7,
we know that the sim-image of a fresh element at position $S^r$ would be k-reversible and introduced
at position $S^r$. Hence, by the Chase Locality Theorem (Theorem V.11), we have $\mathrm{sim}'(b_r) \simeq_k b'_r$, so
the condition is satisfied. This proves that, indeed, we can perform the fact-thrifty chase step that we
described.

We now check that the invariants are preserved. We first observe that for any $S^r \in \mathrm{Dng}(S^q) \setminus \Pi_{i'}$,
the $\leftrightarrow_{\mathrm{FUN}}$-class of $S^r$ is outer. Indeed, if $S^r$ occurred in $\Sigma_{\mathrm{UID}}$, as $S^q$ does because of $\tau$, we know by
assumption reversible that, as the UFD $S^r \to S^q$ is in $\Sigma_{\mathrm{UFD}}$ by dangerousness of $S^r$, the UFD $S^q \to S^r$
also should be, but then we would have $S^r \leftrightarrow_{\mathrm{FUN}} S^q$ so $S^r \in \Pi_{i'}$, a contradiction. Hence, the $\leftrightarrow_{\mathrm{FUN}}$-class
of $S^r$ is indeed outer.

Figure 1: Chase locality example. Elements $b$ and $b'$ are 1-reversible and introduced at positions of $R$
and $U$ respectively. Reversible UIDs are represented by thick edges.


Now, invariant fw is preserved because, by the above observation, the new fact $F_n$ is defined on
the inner classes either following $\mathbf{t}$ or following an existing fact of $J'$. Invariant krev is preserved
by Lemma D.7 for the fresh elements, or by krev on the previous state $J'$ for the existing elements.
Invariant out is preserved because the only elements of $F_n$ that are not in $\mathcal{F}$ or in $\mathrm{dom}(I)$ are those of
$\Pi_{i'}$, which is inner. This shows that the invariants are preserved by the fact-thrifty chase step.

We perform fact-thrifty chase steps until no violations of $\Sigma_{\mathrm{UID}}$ remain: invariant fw guarantees that
we terminate. Indeed, $PI$ is finite, the domain of the resulting instance is bounded by that of $PI$ for all
inner classes, and new elements created in outer classes cannot create violations of $\Sigma_{\mathrm{UID}}$ or cause the
creation of further elements, by definition of their class being outer. Hence, the result of the process is
finite, and it satisfies $\Sigma_{\mathrm{UID}}$ because no violations remain. This concludes the proof.

D.8. Proof of the Chase Locality Theorem (Theorem V.11)

We give an equivalent rephrasing of the Chase Locality Theorem (Theorem V.11) using the notion of
n-reversible elements (Definition D.3):

Theorem D.9 (Chase locality theorem). For any instance $I_0$, transitively closed set of UIDs $\Sigma_{\mathrm{UID}}$, and
$n \in \mathbb{N}$, for any two elements $a$ and $b$ respectively introduced at positions $R^p$ and $S^q$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$
such that $R^p \sim_{\mathrm{ID}} S^q$, if $a$ and $b$ are n-reversible then $a \simeq_n b$.

Note that this result is for an arbitrary set of UIDs and FDs, not relying on any finite closure
properties, or on assumption reversible. (It only assumes that the last n dependencies used to create $a$ and $b$
were reversible.) However we still assume that $\Sigma_{\mathrm{UID}}$ is transitively closed.

Figure 1 illustrates the result in a simple situation. The intuition is the following: n-reversible
elements in the chase have the same neighborhoods up to distance $n$, no matter their exact histories, as
long as they were introduced in $\sim_{\mathrm{ID}}$-equivalent positions: intuitively, the facts that go “downwards” in
the neighborhood of $a$ in the forest structure can be matched to facts in the neighborhood of $b$ because
they are required by $\Sigma_{\mathrm{UID}}$, and the facts “upwards” are also matched up to distance $n$ because of the
reverses of the UIDs used along this chain.

To prove the theorem, fix the instance $I_0$ and the set $\Sigma_{\mathrm{UID}}$ of UIDs. We first show the following easy
lemma:

Lemma D.10. For any $n > 0$ and position $R^p$, for any two elements $a, b$ of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ introduced
at position $R^p$ in two facts $F_a$ and $F_b$, letting $a'$ and $b'$ be the exported elements of $F_a$ and $F_b$, if $a' \simeq_{n-1} b'$,
then $a \simeq_n b$.

Proof. By symmetry, it suffices to show that $a \leq_n b$. We proceed by induction on $n$.

For the base case $n = 1$, observe that, for every fact $F$ of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ where $a$ occurs at some
position $S^q$, there are only two cases. Either $F = F_a$, so we can pick $F_b$ as the representative fact, or the
UID $R^p \subseteq S^q$ is in $\Sigma_{\mathrm{UID}}$ so we can pick a corresponding fact for $b$ by definition of the chase.

For the induction step, we proceed in the same way. If $F = F_a$, we pick $F_b$ and use either the
hypothesis on $a'$ and $b'$ or the induction hypothesis (for other elements of $F_a$ and $F_b$) to justify that $F_b$
is a suitable witness. Otherwise, we pick the corresponding fact for $b$ which must exist by definition of
the chase, and apply the induction hypothesis to the other elements of the fact to conclude. □

We now prove the Chase Locality Theorem. Recall the definition of $\sim_{\mathrm{ID}}$ (Definition IV.6). However,
note that, as we no longer make assumption reversible, while $\sim_{\mathrm{ID}}$ is still an equivalence relation, it is
no longer the case that all UIDs of $\Sigma_{\mathrm{UID}}$ are reflected in $\sim_{\mathrm{ID}}$: the UID $R^p \subseteq S^q$ may be in $\Sigma_{\mathrm{UID}}$ even
though $R^p \not\sim_{\mathrm{ID}} S^q$, if $S^q \subseteq R^p$ is not in $\Sigma_{\mathrm{UID}}$.

We prove by induction on n the main claim: for any positions $R^p$ and $S^q$ such that $R^p \sim_{\mathrm{ID}} S^q$, for any
two n-reversible elements a and b respectively introduced at positions $R^p$ and $S^q$, we have $a \simeq_n b$. By
symmetry it suffices to show that $(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), a) \leq_n (\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), b)$.

The base case of n = 0 is immediate. 

For the induction step, fix $n > 0$, and assume that the result holds for $n - 1$. Fix $R^p$ and $S^q$, and let
a, b be two n-reversible elements introduced respectively at $R^p$ and $S^q$ in facts $F_a$ and $F_b$. Note that by
the induction hypothesis we already know that $(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), a) \leq_{n-1} (\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), b)$; we must
show that this holds for n.

First, observe that, as a and b are n-reversible with $n > 0$, they are not elements of $I_0$. Hence, by
definition of the chase, for each one of them, the following is true: for each fact of the chase where
the element occurs, it only occurs at one position, and all other elements co-occurring with it in a fact
of the chase occur only at one position in only one of these facts. Thus, to prove the claim, it suffices
to construct a mapping $\phi$ from the set $N_1(a)$ of the facts of $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ where a occurs, to the set
$N_1(b)$ of the facts where b occurs, such that the following holds: for every fact $F = T(\mathbf{a})$ of $N_1(a)$,
letting $T^c$ be the position of F such that $a_c = a$ (there is only one such position by construction of the
chase), b occurs at position $T^c$ in $\phi(F) = T(\mathbf{b})$, and for every i, $a_i \leq_{n-1} b_i$.

By construction of the chase (using the Unique Witness Property), $N_1(a)$ consists of exactly the
following facts:

• The fact $F_a = R(\mathbf{a})$, where $a_d = a'$ is the exported element (for a certain d), $a_p = a$ was introduced
at $R^p$ in $F_a$, and for $i \notin \{p, d\}$, $a_i$ was introduced at $R^i$ in $F_a$.

• For every UID $\tau : R^p \subseteq V^g$ of $\Sigma_{\mathrm{UID}}$, a V-fact $F_a^{\tau}$ where all elements were introduced in this fact
except the one at position $V^g$, which is a.

A similar characterization holds for b, with the analogous notation. We construct the mapping $\phi$ as
follows:

• If $R^p = S^q$ then set $\phi(F_a) = F_b$; otherwise, as $\tau : S^q \subseteq R^p$ is in $\Sigma_{\mathrm{UID}}$, set $\phi(F_a)$ to be the fact $F_b^{\tau}$.

• For every UID $\tau : R^p \subseteq V^g$ of $\Sigma_{\mathrm{UID}}$, as $R^p \sim_{\mathrm{ID}} S^q$, by transitivity, either $S^q = V^g$ or the UID
$\tau' : S^q \subseteq V^g$ is in $\Sigma_{\mathrm{UID}}$. In the first case, set $\phi(F_a^{\tau}) = F_b$. In the second case, set $\phi(F_a^{\tau}) = F_b^{\tau'}$.

We must now show that $\phi$ satisfies the required conditions. First, verify that indeed, by construction,
whenever a occurs at position $T^c$ in F then b occurs at position $T^c$ in $\phi(F)$. Second, fix $F \in N_1(a)$,
write $F = T(\mathbf{a})$ and $\phi(F) = T(\mathbf{b})$, with $a_c = a$ and $b_c = b$ for some c, and show that $a_i \leq_{n-1} b_i$ for all
$T^i \in \mathrm{Pos}(T)$. If $n = 1$ there is nothing to show and we are done, so we assume $n \geq 2$. If $i = c$ then the
claim is immediate by the induction hypothesis; otherwise, we distinguish two cases:

1. $F = F_a$ (so that $T = R$ and $c = p$), or $F = F_a^{\tau}$ such that the UID $\tau : R^p \subseteq T^c$ is reversible. In
this case, by construction, either $\phi(F) = F_b$ or $\phi(F) = F_b^{\tau'}$ for $\tau' : S^q \subseteq T^c$, which is then reversible,
because $R^p \sim_{\mathrm{ID}} S^q$ and $R^p \sim_{\mathrm{ID}} T^c$.

We show that for all $1 \leq i \leq |T|$, $i \neq c$, $a_i$ is $(n-1)$-reversible and was introduced in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$
at a position in the $\sim_{\mathrm{ID}}$-class of $T^i$. Once we have proved this, by symmetry we can show the
same for all $b_i$, so that we can conclude that $a_i \leq_{n-1} b_i$ by induction hypothesis. To see why the
claim holds, we distinguish two subcases. Either $a_i$ was introduced in F, or we have $F = F_a$,
$i = d$ and $a_i$ is the exported element for a.

In the first subcase, $a_i$ was created by applying the reversible UID $\tau$ and the exported element a
is n-reversible, so $a_i$ is $(n-1)$-reversible (in fact it is $(n+1)$-reversible), and is introduced at
position $T^i$. In the second subcase, $a_i$ is the exported element used to create a, which is
n-reversible, so $a_i$ is $(n-1)$-reversible; and as $n \geq 2$, the last dependency applied to create $a_i$ is
reversible, so that $a_i$ was introduced at a position in the same $\sim_{\mathrm{ID}}$-class as $T^i$. Hence, we have
proved the desired claim in the first case.

2. $F = F_a^{\tau}$ such that $\tau : R^p \subseteq T^c$ is not reversible. In this case, we cannot have $T^c = S^q$ (because we
have $R^p \sim_{\mathrm{ID}} S^q$), so that $\phi(F) = F_b^{\tau'}$, and all $a_i$ for $i \neq c$ were introduced in F at position $T^i$, and
likewise for the $b_i$ in $\phi(F)$. Using Lemma D.10, as $a \simeq_{n-1} b$, we conclude that $a_i \simeq_n b_i$, hence
$a_i \leq_{n-1} b_i$.

This concludes the proof.

E. Proofs for Section VI: Arbitrary UIDs: Lifting Assumption
Reversible

This appendix proves the claims needed to complete our proof of Theorem III.6, the existence of
universal instances for UIDs, UFDs, and acyclic CQs of fixed size. The main claim is the existence of
manageable partitions (Lemma VI.5).

Remember that we are assuming the “Unique Witness Property” (Section II) and that the constraints
$\Sigma_{\mathrm{U}}$ are closed under the finite closure rule (in particular, $\Sigma_{\mathrm{UID}}$ is transitively closed).

E.1. Finite closure computation algorithm

For convenience we recall here how the finite closure is computed, from [8].

Given a set $\Sigma = \Sigma_{\mathrm{FD}} \cup \Sigma_{\mathrm{UID}}$ of FDs and UIDs, an ID path of $\Sigma$ is a sequence of UIDs of $\Sigma_{\mathrm{UID}}$ of the
following form: $R_1^{i_1} \subseteq R_2^{j_2}, R_2^{i_2} \subseteq R_3^{j_3}, \ldots, R_{n-1}^{i_{n-1}} \subseteq R_n^{j_n}$, with $i_k \neq j_k$ for all k. The path is functional if, for
all $1 < k < n$, $R_k^{i_k} \to R_k^{j_k} \in \Sigma_{\mathrm{FD}}$. Note that our definition of the $\rightarrowtail$ relation ensures that $\tau \rightarrowtail \tau'$ iff $\tau, \tau'$
is a functional ID path.

An invertible cycle C of $\Sigma$ is a functional ID path with $R_n = R_1$ and $j_n = j_1$ (so that $R_1^{i_1} \to R_1^{j_1} \in \Sigma_{\mathrm{FD}}$):
a UID that occurs in an invertible cycle is said to be invertible. The reverse $C^{-1}$ of an invertible cycle C is
$R_n^{j_n} \subseteq R_{n-1}^{i_{n-1}}, \ldots, R_2^{j_2} \subseteq R_1^{i_1}$.

Applying the cycle closure rule in $\Sigma$ means taking every invertible cycle C of $\Sigma$ and adding to $\Sigma$ the
UIDs and UFDs needed to make $C^{-1}$ an invertible cycle in $\Sigma$, namely, $R_2^{j_2} \subseteq R_1^{i_1}, \ldots, R_n^{j_n} \subseteq R_{n-1}^{i_{n-1}}$,
and $R_k^{j_k} \to R_k^{i_k}$ for $1 \leq k \leq n$. The finite closure is computed by closing under the rule above and by
implication of the UIDs and of the FDs in isolation.
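
As a minimal sketch of one application of the cycle closure rule, the following Python fragment checks that a list of UIDs forms an invertible cycle and returns the UIDs and UFDs that the rule would add for it. The encoding is an assumption made for illustration and is not taken from [8]: a position $R^i$ is a pair (relation name, index), a UID $R^i \subseteq S^j$ and a UFD $R^i \to R^j$ are pairs of positions, and both function names are hypothetical. The sketch handles a single cycle only; it neither enumerates cycles nor closes under implication.

```python
from typing import List, Set, Tuple

Pos = Tuple[str, int]   # position R^i, encoded as (relation name, index)
UID = Tuple[Pos, Pos]   # (R^i, S^j) stands for the UID  R^i ⊆ S^j
UFD = Tuple[Pos, Pos]   # (R^i, R^j) stands for the UFD  R^i -> R^j


def is_invertible_cycle(path: List[UID], ufds: Set[UFD]) -> bool:
    """Check that `path`, read cyclically, is a functional ID path."""
    if not path:
        return False
    for k in range(len(path)):
        prev_dst = path[k - 1][1]   # second position of the previous UID
        cur_src = path[k][0]        # first position of the current UID
        if prev_dst[0] != cur_src[0]:
            return False            # consecutive UIDs must meet in one relation
        if prev_dst == cur_src:
            return False            # the UFD between them must be non-trivial
        if (cur_src, prev_dst) not in ufds:
            return False            # the UFD cur_src -> prev_dst must hold
    return True


def cycle_closure_additions(path: List[UID]) -> Tuple[Set[UID], Set[UFD]]:
    """UIDs and UFDs that make the reverse of `path` an invertible cycle."""
    reversed_uids = {(dst, src) for (src, dst) in path}
    reversed_ufds = {(path[k - 1][1], path[k][0]) for k in range(len(path))}
    return reversed_uids, reversed_ufds
```

For instance, under this encoding, the two UIDs `(("R", 1), ("S", 1))` and `(("S", 2), ("R", 2))` together with the UFDs $R^1 \to R^2$ and $S^2 \to S^1$ form an invertible cycle, and the rule adds the two reversed UIDs and the two reversed UFDs.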

The fact that the result is exactly the finite closure of $\Sigma$ is shown in [8].

E.2. Proof of Lemma VI.2 (New violations follow $\rightarrowtail$)

Lemma VI.2. Let J be an aligned superinstance of $I_0$ and $J'$ be the result of applying a thrifty chase
step on J for a UID $\tau$ of $\Sigma_{\mathrm{UID}}$. Assume that a UID $\tau'$ of $\Sigma_{\mathrm{UID}}$ was satisfied by J but is not satisfied by $J'$.
Then $\tau \rightarrowtail \tau'$.

Fix J, $J'$ and $\tau : R^p \subseteq S^q$ and $\tau'$. As chase steps add a single fact, the only new UID violations in
$J'$ relative to J are on elements in the newly created fact $F_{\mathrm{new}} = S(\mathbf{b})$. As $\Sigma_{\mathrm{UID}}$ is transitively closed,
$F_{\mathrm{new}}$ can introduce no new violation on the exported element $b_q$. Now, as thrifty chase steps always reuse
existing elements at non-dangerous positions, we know that if $S^r \in \mathrm{NDng}(S^q)$ then no new UID can be
applicable to $b_r$. Hence, if a new UID is applicable to $b_r$ for $S^r \in \mathrm{Pos}(S)$, then necessarily $S^r \in \mathrm{Dng}(S^q)$.
By definition of dangerous positions, the UFD $S^r \to S^q$ is in $\Sigma_{\mathrm{UFD}}$, and it is non-trivial because $S^r \neq S^q$.
Hence, writing $\tau' : S^r \subseteq T^u$, we see that $\tau \rightarrowtail \tau'$.

E.3. Proof of Corollary VI.4 (Dealing with trivial classes)

Corollary VI.4. For any trivial class $\{\tau\}$, performing one chase round on an aligned fact-saturated
superinstance J of $I_0$ by fresh fact-thrifty chase steps for $\tau$ yields an aligned superinstance $J'$ of $I_0$ that
satisfies $\tau$.

Fix J, $J'$ and $\tau$. All violations of $\tau$ in J have been satisfied in $J'$ by definition of $J'$, so we only
have to show that no new violations of $\tau$ were introduced in $J'$. But by Lemma VI.2, as $\tau \not\rightarrowtail \tau$, each
fresh fact-thrifty chase step cannot introduce such a violation, hence there is no new violation of $\tau$ in
$J'$. Hence, $J' \models \tau$.

E.4. Proof of Lemma VI.5 (Existence of manageable partitions)

Our goal in this section is to show:

Lemma VI.5. Any conjunction $\Sigma_{\mathrm{UID}}$ of UIDs closed under finite implication has a manageable
partition.

We assume that $\Sigma_{\mathrm{UID}}$ is closed under the finite closure rule (see Appendix E.1). Hence, in particular,
it is transitively closed.

We start by introducing definitions about the $\rightarrowtail$ relation, which we recall is defined so that $\tau \rightarrowtail \tau'$
for $\tau, \tau' \in \Sigma_{\mathrm{UID}}$ whenever $\tau, \tau'$ is a functional ID path, namely: letting $\tau : R^p \subseteq S^q$ and $\tau' : S^r \subseteq T^u$, the
UFD $S^r \to S^q$ is non-trivial and is in $\Sigma_{\mathrm{UFD}}$.

We extend $\rightarrowtail$ to sets of UIDs in the expected way: $P \rightarrowtail P'$ if there exist $\tau \in P$, $\tau' \in P'$ such that
$\tau \rightarrowtail \tau'$.

Definition E.1. The ID graph $\Gamma(\Sigma_{\mathrm{UID}})$ is the directed graph (with self-loops) defined on $\Sigma_{\mathrm{UID}}$ by the
$\rightarrowtail$ relation. We define the strongly connected components of $\Gamma(\Sigma_{\mathrm{UID}})$ as usual: an SCC is a maximal
subset P of $\Sigma_{\mathrm{UID}}$ such that for all $\tau, \tau' \in P$, we have $\tau \rightarrowtail^* \tau'$, where $\rightarrowtail^*$ denotes the transitive and
reflexive closure of the $\rightarrowtail$ relation. The SCC graph $G(\Sigma_{\mathrm{UID}})$ is the directed acyclic graph (without
self-loops) defined on the SCCs of $\Gamma(\Sigma_{\mathrm{UID}})$ such that, for any two SCCs $P \neq P'$ of $\Gamma(\Sigma_{\mathrm{UID}})$, there is an
edge from P to $P'$ iff $P \rightarrowtail P'$.

Note that the definition of SCCs allows both singleton SCCs $\{\tau\}$ where we have a self-loop ($\tau \rightarrowtail \tau$),
and singletons where there is none ($\tau \not\rightarrowtail \tau$). We say that an SCC is trivial if it is a singleton without
self-loops. Otherwise, if the SCC is not a singleton or if it has a self-loop, we call it non-trivial.
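
The definitions above can be sketched in Python as follows. This is only an illustration under assumed encodings (positions are pairs of a relation name and an index, UIDs and UFDs are pairs of positions; all function names are hypothetical). It builds the ID graph from the non-trivial UFD condition, computes SCCs by mutual reachability (a simple quadratic method chosen for brevity, not the standard linear-time algorithm), and tests triviality.

```python
def id_graph(uids, ufds):
    """Edges t -> t2 whenever the UFD (first pos of t2) -> (second pos of t)
    is non-trivial and present in `ufds`."""
    edges = {t: set() for t in uids}
    for t in uids:
        for t2 in uids:
            dst, src = t[1], t2[0]
            if dst[0] == src[0] and dst != src and (src, dst) in ufds:
                edges[t].add(t2)
    return edges


def strongly_connected_components(edges):
    """SCCs as frozensets, computed from reflexive-transitive reachability."""
    reach = {}
    for start in edges:
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in edges[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        reach[start] = seen
    components, assigned = [], set()
    for t in edges:
        if t not in assigned:
            comp = frozenset(u for u in reach[t] if t in reach[u])
            components.append(comp)
            assigned |= comp
    return components


def is_trivial(component, edges):
    """A trivial SCC is a singleton without a self-loop."""
    if len(component) != 1:
        return False
    (t,) = component
    return t not in edges[t]
```

On an input with two UIDs that follow each other in both directions and a third unrelated UID, this yields one non-trivial SCC of size two and one trivial singleton.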

We first show the following lemma to understand the structure of the SCCs of $\Gamma(\Sigma_{\mathrm{UID}})$. This lemma
is proved in Appendix E.5.

Lemma E.2 (SCC structure). The SCCs of $\Gamma(\Sigma_{\mathrm{UID}})$ are transitively closed sets of UIDs. Further, for
any non-trivial SCC P, letting $P^{-1} := \{\tau^{-1} \mid \tau \in P\}$, all UIDs of $P^{-1}$ are in $\Sigma_{\mathrm{UID}}$, and $P^{-1}$ is an SCC
of $\Gamma(\Sigma_{\mathrm{UID}})$.

Note that P and $P^{-1}$, as SCCs of $\Gamma(\Sigma_{\mathrm{UID}})$, may be equal or disjoint. We accordingly call self-inverse
an SCC P that is non-trivial but satisfies $P = P^{-1}$; non-trivial SCCs such that P and $P^{-1}$ are disjoint are
called non-self-inverse.

Given the structure of the SCCs, the first step to construct a manageable partition is to construct a
topological sort of the SCC graph $G(\Sigma_{\mathrm{UID}})$ of $\Gamma(\Sigma_{\mathrm{UID}})$, but with an additional property, motivated by
what we showed in Lemma E.2:

Definition E.3. A topological sort of $G(\Sigma_{\mathrm{UID}})$ is inverse-sequential if, for any non-self-inverse SCC P,
the SCCs P and $P^{-1}$ are enumerated consecutively.

The first result, proven in Appendix E.6, is to justify that we can indeed construct an
inverse-sequential topological sort of the SCC graph of $\Gamma(\Sigma_{\mathrm{UID}})$:

Proposition E.4 (Inverse-sequential topological sort). For any conjunction $\Sigma_{\mathrm{UID}}$ of UIDs closed under
finite implication, $G(\Sigma_{\mathrm{UID}})$ has an inverse-sequential topological sort.

The second step is to construct the manageable partition itself from the inverse-sequential topological
sort. Here is how we define the ordered partition from the topological sort:

Definition E.5. An inverse-sequential topological sort defines an ordered partition $(P_1, \ldots, P_n)$ of $\Sigma_{\mathrm{UID}}$
in the following way: each class $P_i$ of the partition either corresponds to one SCC of $G(\Sigma_{\mathrm{UID}})$ (which
is either trivial or self-inverse), or to the union of an SCC and its inverse SCC (which were enumerated
consecutively because the topological sort is inverse-sequential). It is immediate that $(P_1, \ldots, P_n)$ is
indeed an ordered partition, as it is constructed from a topological sort by merging some classes that
were enumerated consecutively.
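
The merging step of Definition E.5 can be sketched as below. The representation is an assumption for illustration only: SCCs are identified by arbitrary names, `order` is an inverse-sequential topological sort given as a list, and the hypothetical map `inverse` sends each SCC to its inverse SCC (itself when self-inverse, `None` when trivial).

```python
def ordered_partition(order, inverse):
    """Merge each non-self-inverse SCC with its inverse, which must follow it."""
    classes = []
    i = 0
    while i < len(order):
        scc = order[i]
        inv = inverse.get(scc)
        if inv is not None and inv != scc:
            # Non-self-inverse SCC: the sort must list its inverse right after it.
            if i + 1 >= len(order) or order[i + 1] != inv:
                raise ValueError("the given sort is not inverse-sequential")
            classes.append((scc, inv))
            i += 2
        else:
            # Trivial or self-inverse SCC: it forms a class on its own.
            classes.append((scc,))
            i += 1
    return classes
```

Each resulting class is either a single SCC or a pair consisting of an SCC and its inverse, in the order given by the sort.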

The second result is to show that the resulting ordered partition is indeed a manageable partition. In
other words, we must show that the classes of the partition are either trivial, or that they are a set of
UIDs that is transitively closed and satisfies assumption reversible.

Proposition E.6 (Manageable partitions from sorts). For any conjunction $\Sigma_{\mathrm{UID}}$ of UIDs closed under
finite implication, letting P be an ordered partition obtained from an inverse-sequential topological
sort of $G(\Sigma_{\mathrm{UID}})$, P is a manageable partition.

This second result is proven in Appendix E.7 and concludes the proof of our original claim.

E.5. Proof of the SCC Structure Lemma (Lemma E.2)

Lemma E.2 (SCC structure). The SCCs of $\Gamma(\Sigma_{\mathrm{UID}})$ are transitively closed sets of UIDs. Further, for
any non-trivial SCC P, letting $P^{-1} := \{\tau^{-1} \mid \tau \in P\}$, all UIDs of $P^{-1}$ are in $\Sigma_{\mathrm{UID}}$, and $P^{-1}$ is an SCC
of $\Gamma(\Sigma_{\mathrm{UID}})$.

We first show a general lemma:

Lemma E.7. Let P be a non-trivial SCC of $\Gamma(\Sigma_{\mathrm{UID}})$. For any $\tau, \tau' \in P$, there is an invertible cycle
of UIDs of P in which $\tau$ and $\tau'$ occur.

Proof. Because P is a non-trivial SCC, we have $\tau \rightarrowtail^* \tau'$ and $\tau' \rightarrowtail^* \tau$, and the desired invertible cycle
is obtained by concatenating the functional ID paths from $\tau$ to $\tau'$, and from $\tau'$ to $\tau$. Because P is an
SCC, it is immediate that the UIDs of the resulting path are all in P. □

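We then divide our claim in two lemmas:

Lemma E.8. Let P be an SCC of $\Gamma(\Sigma_{\mathrm{UID}})$. Then P is closed under the transitivity rule.

Proof. Let P be an SCC. If P consists of a single UID, then transitivity is immediately respected, so
we assume that P contains more than one UID. In particular, P is non-trivial. Let $\tau : R^p \subseteq S^q$ and $\tau' : S^q \subseteq T^r$ be
two UIDs of P with $R^p \neq T^r$. As $\Sigma_{\mathrm{UID}}$ is closed under transitivity, we know $\tau'' : R^p \subseteq T^r$ is in $\Sigma_{\mathrm{UID}}$. We
show that $\tau'' \in P$.

As P is a non-trivial SCC, there is a functional ID path $\tau' = \tau_1 \rightarrowtail \tau_2 \rightarrowtail \cdots \rightarrowtail \tau_n = \tau$, where $\tau_i \in P$
for all $1 \leq i \leq n$. Because of the UFDs that must be in $\Sigma_{\mathrm{UFD}}$ to make it a functional ID path, it is
immediate that the following two paths are functional ID paths as well: $\tau'' \rightarrowtail \tau_2 \rightarrowtail \cdots \rightarrowtail \tau_n$ and
$\tau_1 \rightarrowtail \cdots \rightarrowtail \tau_{n-1} \rightarrowtail \tau''$. Thus we have $\tau'' \rightarrowtail^* \tau$ and $\tau' \rightarrowtail^* \tau''$, where $\tau, \tau' \in P$, so that $\tau'' \in P$ by
definition of an SCC. □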

Lemma E.9. Let P be a non-trivial SCC of $\Gamma(\Sigma_{\mathrm{UID}})$, and let $P^{-1} := \{\tau^{-1} \mid \tau \in P\}$. Then $P^{-1} \subseteq \Sigma_{\mathrm{UID}}$,
and $P^{-1}$ is an SCC of $\Gamma(\Sigma_{\mathrm{UID}})$.

Proof. We first prove that, for any $\tau \in P$, $\tau^{-1} \in \Sigma_{\mathrm{UID}}$. This is a direct consequence of Lemma E.7:
there is an invertible cycle of P containing $\tau$, so that by definition of an invertible cycle, $\tau^{-1}$ is in $\Sigma_{\mathrm{UID}}$.
We now turn to the second part of the claim.

First, we show that for any two $\tau, \tau' \in P^{-1}$, there is a functional ID path from $\tau$ to $\tau'$, so that $P^{-1}$ is
strongly connected. This is clear: by Lemma E.7, there exists an invertible cycle C of P containing $\tau^{-1}$
and $(\tau')^{-1}$; the reverse $C^{-1}$ of this cycle is also an invertible cycle, because $\Sigma_{\mathrm{U}}$ is finitely closed;
$C^{-1}$ is then a cycle of UIDs of $P^{-1}$ containing $\tau$ and $\tau'$.

Second, we show that for any UID $\tau \in \Sigma_{\mathrm{UID}}$, if $P^{-1} \rightarrowtail^* \tau$ and $\tau \rightarrowtail^* P^{-1}$ then $\tau \in P^{-1}$. Consider
such a UID $\tau$, and let $p_1 : \tau' = \tau_1 \rightarrowtail \cdots \rightarrowtail \tau_n = \tau$ and $p_2 : \tau = \tau'_1 \rightarrowtail \cdots \rightarrowtail \tau'_m = \tau''$ be the witnessing
functional ID paths, with $\tau', \tau'' \in P^{-1}$. We showed in the previous paragraph that $P^{-1}$ is strongly
connected: consider a (possibly empty) functional ID path $p_3$ from $\tau''$ to $\tau'$ witnessing the fact that
$\tau'' \rightarrowtail^* \tau'$. Concatenating $p_1$, $p_2$ and $p_3$ yields an invertible cycle C, so that because $\Sigma_{\mathrm{U}}$ is finitely closed,
its reverse $C^{-1}$ is also an invertible cycle. But $C^{-1}$ witnesses the fact that $(\tau')^{-1} \rightarrowtail^* \tau^{-1}$ and $\tau^{-1} \rightarrowtail^* (\tau')^{-1}$.
Now, as $(P^{-1})^{-1} \subseteq P$ and P is an SCC, we have $\tau^{-1} \in P$, so that $\tau \in P^{-1}$, the desired claim. Hence,
$P^{-1}$ is both strongly connected and maximal, so it is an SCC. □

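This concludes the proof. Note that, as P and $P^{-1}$ are both SCCs of $\Gamma(\Sigma_{\mathrm{UID}})$, either they are equal
or they are disjoint. We observe that both cases may occur:

Example E.10. Consider the UIDs $\tau : R^1 \subseteq S^1$ and $\tau' : S^2 \subseteq R^2$, and the UFDs $\phi : R^1 \to R^2$ and
$\phi' : S^2 \to S^1$. $\tau, \tau'$ is an invertible cycle, so that by the finite closure rule, the UIDs $\tau^{-1}$ and $(\tau')^{-1}$
and the reverse UFDs are implied. However, in $\Gamma(\Sigma_{\mathrm{UID}})$ we have $\tau \rightarrowtail \tau'$, $\tau' \rightarrowtail \tau$, $\tau^{-1} \rightarrowtail (\tau')^{-1}$ and
$(\tau')^{-1} \rightarrowtail \tau^{-1}$, so that $\{\tau, \tau'\}$ and $\{\tau^{-1}, (\tau')^{-1}\}$ are disjoint SCCs.

Consider now the UIDs $\tau : R^1 \subseteq S^1$, $\tau^{-1} : S^1 \subseteq R^1$, $\tau' : R^2 \subseteq R^3$, $\tau'' : S^2 \subseteq S^3$, and the UFDs $R^2 \to R^1$,
$R^1 \to R^3$, $S^2 \to S^1$, and $S^1 \to S^3$. We can construct the invertible cycles $\tau'$ and $\tau''$, so that $(\tau')^{-1}$ and
$(\tau'')^{-1}$ are implied by the finite closure rule. However, besides $\tau' \rightarrowtail \tau'$, $\tau'' \rightarrowtail \tau''$, $(\tau')^{-1} \rightarrowtail (\tau')^{-1}$ and
$(\tau'')^{-1} \rightarrowtail (\tau'')^{-1}$, it is also the case that $\tau \rightarrowtail \tau''$, $\tau'' \rightarrowtail \tau^{-1}$, $\tau^{-1} \rightarrowtail \tau'$ and $\tau' \rightarrowtail \tau$, and using the
reverse UFDs the same is true of the inverses of $\tau'$ and $\tau''$. So in fact there is only one
SCC $P = \{\tau, \tau^{-1}, \tau', (\tau')^{-1}, \tau'', (\tau'')^{-1}\}$, with $P^{-1} = P$.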


E.6. Proof of the Inverse-Sequential Topological Sort Proposition
(Proposition E.4)

We now prove that $G(\Sigma_{\mathrm{UID}})$ has an inverse-sequential topological sort:

Proposition E.4 (Inverse-sequential topological sort). For any conjunction $\Sigma_{\mathrm{UID}}$ of UIDs closed under
finite implication, $G(\Sigma_{\mathrm{UID}})$ has an inverse-sequential topological sort.

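For this we need the following observation about $G(\Sigma_{\mathrm{UID}})$:

Lemma E.11. Let P be a non-self-inverse SCC and consider $\tau \in \Sigma_{\mathrm{UID}} \setminus (P \cup P^{-1})$ such that $\tau \rightarrowtail P$.
Then one of the following holds:

• we have $\tau \rightarrowtail P^{-1}$;

• the SCC of $\tau$ is trivial, and for any $\tau_p \in \Sigma_{\mathrm{UID}}$ such that $\tau_p \rightarrowtail \tau$, we have $\tau_p \rightarrowtail^* P^{-1}$.

Proof. Fix $\tau \in \Sigma_{\mathrm{UID}} \setminus (P \cup P^{-1})$ and assume that we have $\tau \rightarrowtail P$, i.e., $\tau \rightarrowtail \tau'$ for some $\tau' \in P$. As P
is non-trivial, using Lemma E.7, consider the predecessor $\tau'_{n-1}$ of $\tau'$ in an invertible cycle containing
$\tau'$ (possibly $\tau'_{n-1} = \tau'$). Let $R^p$ be the second position of $\tau$, $R^q$ be the first position of $\tau'$, and $R^r$ be the
second position of $\tau'_{n-1}$. Note that we have $R^r \neq R^q$ because $\tau'_{n-1} \rightarrowtail \tau'$, and $R^p \neq R^q$ because $\tau \rightarrowtail \tau'$.
Observe that if $R^p \neq R^r$, then $\tau \rightarrowtail (\tau'_{n-1})^{-1}$, because $R^r \to R^q$ and $R^q \to R^p$ hold in $\Sigma_{\mathrm{UFD}}$ (as these
UFDs are used in an invertible cycle) and $\Sigma_{\mathrm{UFD}}$ is closed under transitivity. This proves the claim, as
taking $\tau'' := (\tau'_{n-1})^{-1} \in P^{-1}$, we have $\tau \rightarrowtail \tau''$.

If $R^p = R^r$, let $P'$ be the SCC of $\tau$. Assume first that $P'$ is non-trivial. In this case, by Lemma E.7,
there is an invertible cycle $\tau = \tau_1, \ldots, \tau_m = \tau$ in $P'$. But then, we have $\tau'_{n-1} \rightarrowtail \tau_2$, so that $P \rightarrowtail P'$, and
as $P' \rightarrowtail P$ we have $P = P'$, so $\tau \in P$, a contradiction.

Hence, $P'$ is trivial. Let $S^q$ be the first position of $\tau$ and $T^u$ be the first position of $\tau'_{n-1}$. We must
have $S^q \neq T^u$, as otherwise we have $\tau = \tau'_{n-1}$, so $\tau \in P$, a contradiction. Hence, because $(\tau'_{n-1})^{-1}$ is in
$\Sigma_{\mathrm{UID}}$ (as $\tau'_{n-1} \in P$), by transitivity $\tau'' : S^q \subseteq T^u$ is in $\Sigma_{\mathrm{UID}}$. We can then see that $\tau'' \rightarrowtail (\tau'_{n-2})^{-1}$ because
we had $(\tau'_{n-1})^{-1} \rightarrowtail (\tau'_{n-2})^{-1}$ and the two UIDs share the same second position; hence, $\tau'' \rightarrowtail P^{-1}$. Now,
as $\tau''$ and $\tau$ have the same first position, for any $\tau_p \in \Sigma_{\mathrm{UID}}$, clearly $\tau_p \rightarrowtail \tau$ implies that $\tau_p \rightarrowtail \tau'' \rightarrowtail P^{-1}$,
proving the last part of the claim. □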

We now construct the inverse-sequential topological sort of $G(\Sigma_{\mathrm{UID}})$ by enumerating the SCCs in a
certain way that respects the $\rightarrowtail$ relation and maintains the following invariant: whenever P is
non-self-inverse, then P and $P^{-1}$ are enumerated consecutively; this guarantees that the result is a topological
sort and that it is inverse-sequential.

First, whenever trivial or self-inverse SCCs can be enumerated, enumerate them. Second, whenever
the SCCs that can be enumerated are all non-self-inverse, choose one such P to enumerate. By the
invariant, $P^{-1}$ has not yet been enumerated, otherwise P would have been enumerated immediately
after. We want to enumerate P, and then enumerate $P^{-1}$.

To see why this is doable, we must show that, assuming that $P \neq P^{-1}$, if P can be enumerated and
no trivial or self-inverse SCCs can be enumerated, then $P^{-1}$ can also be enumerated. Let $P'$ be a parent
SCC of $P^{-1}$ in $G(\Sigma_{\mathrm{UID}})$ (so that $P' \rightarrowtail P^{-1}$), and show that it has been enumerated already. If we have
$P' = P$, meaning that $P \rightarrowtail P^{-1}$, then this is not a problem, because we are about to enumerate first P
and then $P^{-1}$, so we may assume that $P' \neq P$. Hence, $P'$ is different from P and $P^{-1}$, so it is disjoint
from them. We apply Lemma E.11 to any $\tau \in P'$. In the first case, we also have $P' \rightarrowtail P$, so as P can be
enumerated, $P'$ was enumerated already. In the second case, $P' = \{\tau\}$ is trivial; further, considering any
$P'' \rightarrowtail P'$, we have $P'' \rightarrowtail^* P$, so $P''$ was enumerated already. Hence, all such $P''$ are already enumerated,
so that $P'$ can be enumerated, but as it is trivial, it must have been enumerated already. Hence, in both
cases $P'$ was already enumerated unless it is P. This ensures that we can indeed enumerate P and $P^{-1}$
consecutively, maintaining our invariant. Thus, we have constructed an inverse-sequential topological
sort of $G(\Sigma_{\mathrm{UID}})$. This concludes the proof.
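
The enumeration strategy of this proof can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the SCC graph is assumed to be given abstractly by a list of SCC names, a hypothetical map `parents` from each SCC to the set of SCCs with an edge into it, and a hypothetical map `inverse` (the SCC itself when self-inverse, `None` when trivial). The proof above guarantees that the consistency check never fails on $G(\Sigma_{\mathrm{UID}})$; on arbitrary input the sketch raises an error instead.

```python
def inverse_sequential_sort(nodes, parents, inverse):
    """Greedy enumeration: trivial and self-inverse SCCs first, then a
    non-self-inverse SCC immediately followed by its inverse."""
    order, done = [], set()

    def ready(n):
        # An SCC can be enumerated once all its parents have been.
        return n not in done and parents[n] <= done

    def is_paired(n):
        # Non-self-inverse SCCs are those with a distinct inverse SCC.
        return inverse[n] is not None and inverse[n] != n

    while len(order) < len(nodes):
        simple = [n for n in nodes if ready(n) and not is_paired(n)]
        if simple:
            order.append(simple[0])
            done.add(simple[0])
            continue
        p = next((n for n in nodes if ready(n)), None)
        if p is None:
            raise ValueError("no SCC can be enumerated: the graph is not acyclic")
        q = inverse[p]
        # Every parent of the inverse, other than p itself, must be done already.
        if not parents[q] <= done | {p}:
            raise ValueError("the inverse SCC cannot be enumerated right after p")
        order.extend([p, q])
        done.update([p, q])
    return order
```

The returned list enumerates every parent before its children and places each non-self-inverse SCC directly before its inverse.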

E.7. Proof of the Manageable Partitions From Sorts Proposition
(Proposition E.6)

Proposition E.6 (Manageable partitions from sorts). For any conjunction $\Sigma_{\mathrm{UID}}$ of UIDs closed under
finite implication, letting P be an ordered partition obtained from an inverse-sequential topological
sort of $G(\Sigma_{\mathrm{UID}})$, P is a manageable partition.

Let $(P_1, \ldots, P_n)$ be the ordered partition. We prove that it is manageable. Trivial SCCs are indeed
trivial classes of the partition, so we must only justify that any other class $P_i$ is transitively closed and
satisfies assumption reversible.

We define Pos(P) for P a set of UIDs as the set of positions occurring in P, as in the definition of
assumption reversible. We first prove a general lemma to take care of the second part of the assumption:

Lemma E.12. Let P be a non-trivial SCC of $\Gamma(\Sigma_{\mathrm{UID}})$. For any two positions $R^i \neq R^j$ of Pos(P), if
$R^i \to R^j$ is in $\Sigma_{\mathrm{UFD}}$ then so is $R^j \to R^i$.

Proof. Fix $R^i$ and $R^j$, assume that $\phi : R^i \to R^j$ is in $\Sigma_{\mathrm{UFD}}$, and show that $\phi^{-1} : R^j \to R^i$ also is. Let $\tau_i$
be a UID of P where $R^i$ occurs, and $\tau_j$ be a UID of P where $R^j$ occurs. By Lemma E.7, there exists an
invertible cycle $C_1$ where $R^i$ and $R^j$ occur.

We write $C_1 = R_1^{i_1} \subseteq R_2^{j_2}, \ldots, R_n^{i_n} \subseteq R_1^{j_1}$, with some $1 \leq p, q \leq n$ such that $R_p = R_q = R$, and either
$i_p = i$ or $j_p = i$, and either $i_q = j$ or $j_q = j$. By definition of an invertible cycle, the UFDs $\phi_p : R^{i_p} \to R^{j_p}$,
$\phi_p^{-1} : R^{j_p} \to R^{i_p}$, $\phi_q : R^{i_q} \to R^{j_q}$ and $\phi_q^{-1} : R^{j_q} \to R^{i_q}$ are in $\Sigma_{\mathrm{UFD}}$. Thus, because $\Sigma_{\mathrm{UFD}}$ is closed under
transitivity, it is clear that if two positions among $S = \{R^{i_p}, R^{j_p}, R^{i_q}, R^{j_q}\}$ are equal (in particular, if
$p = q$), then we have $R^x \leftrightarrow_{\mathrm{fun}} R^y$ for any two positions $R^x$, $R^y$ in S. Hence, as we know that $R^i \neq R^j$
and $R^i$ and $R^j$ are in S, the only case where we cannot conclude is the one where all the positions of S
are different.

If all positions of S are different, then, because of $\phi_p$, $\phi_q$, $\phi_p^{-1}$, $\phi_q^{-1}$ and by transitivity of $\Sigma_{\mathrm{UFD}}$, we
know that for any $x_1, x_2 \in \{i_p, j_p\}$, $y_1, y_2 \in \{i_q, j_q\}$, the UFD $R^{x_1} \to R^{y_1}$ is in $\Sigma_{\mathrm{UFD}}$ iff the UFD $R^{x_2} \to R^{y_2}$
is. Hence, since $\phi$ is in $\Sigma_{\mathrm{UFD}}$, as $i \in \{i_p, j_p\}$ and $j \in \{i_q, j_q\}$, we know that $R^x \to R^y$ is in $\Sigma_{\mathrm{UFD}}$ for all
$x \in \{i_p, j_p\}$, $y \in \{i_q, j_q\}$, and, to prove that $\phi^{-1}$ is in $\Sigma_{\mathrm{UFD}}$, it suffices to show that $R^y \to R^x$ is in $\Sigma_{\mathrm{UFD}}$
for some $x \in \{i_p, j_p\}$, $y \in \{i_q, j_q\}$.

So let us construct the cycle $C_2 = R_1^{i_1} \subseteq R_2^{j_2}, \ldots, R_{q-1}^{i_{q-1}} \subseteq R_q^{j_q}, R_p^{i_p} \subseteq R_{p+1}^{j_{p+1}}, \ldots, R_n^{i_n} \subseteq R_1^{j_1}$. This is an
invertible cycle, because $R_q = R_p = R$, and $R^{i_p} \neq R^{j_q}$ and the FD $R^{i_p} \to R^{j_q}$ is in $\Sigma_{\mathrm{UFD}}$ by our assumption.
Hence, as $C_2$ is an invertible cycle, and because $\Sigma_{\mathrm{U}}$ is finitely closed, the reverse FD $R^{j_q} \to R^{i_p}$ is in
$\Sigma_{\mathrm{UFD}}$, which implies that $\phi^{-1}$ is in $\Sigma_{\mathrm{UFD}}$. □

We then show a lemma to help justify that the classes are transitively closed:

Lemma E.13. For any non-trivial SCC P, if there is $\tau \in P$ and $\tau' \in P^{-1}$ such that $\tau^{-1} \neq \tau'$ but the
second position of $\tau$ is the first position of $\tau'$, then $P = P^{-1}$.

Proof. We first observe that we have $P \rightarrowtail^* P^{-1}$. Indeed, as P and $P^{-1}$ are non-trivial, consider $\tau_0 \in P$
and $\tau'_0 \in P^{-1}$ such that $\tau_0 \rightarrowtail \tau$ and $\tau' \rightarrowtail \tau'_0$. Letting $\tau''$ be the UID which is transitively implied by $\tau$
and $\tau'$, we know that it must be in $\Sigma_{\mathrm{UID}}$ as it is transitively closed, and we observe that $\tau_0 \rightarrowtail \tau'' \rightarrowtail \tau'_0$,
so that $P \rightarrowtail^* P^{-1}$.

Now, write $\tau : R^p \subseteq S^q$ and $\tau' : S^q \subseteq T^r$, with $R^p \neq T^r$. As P and $P^{-1}$ are non-trivial, using Lemma E.7,
we can consider a functional ID path $\tau = \tau_1 \rightarrowtail \tau_2 \rightarrowtail \cdots \rightarrowtail \tau_n = (\tau')^{-1}$, and a functional ID path
$\tau^{-1} = \tau'_1 \rightarrowtail \cdots \rightarrowtail \tau'_m = \tau'$. By Lemma E.12, all UFDs along these paths are such that their reverses
are also in $\Sigma_{\mathrm{UFD}}$. Consider now the smallest $k \geq 2$ such that we have $\tau_k^{-1} \neq \tau'_{m-k+1}$; such a k must exist
because we have $\tau_n = (\tau')^{-1}$ and $\tau'_1 = \tau^{-1}$, and we know that $\tau^{-1} \neq \tau'$. Consider $\tau'' := \tau_k \in P$, and
$\tau''' := \tau'_{m-k+1} \in P^{-1}$, and let $S^a$ and $S^b$ be respectively the first position of $\tau''$ and the second position
of $\tau'''$: indeed it is easily observed that these positions must be in the same relation S, as this is true for
$\tau_2$ and $\tau'_{m-1}$ and is preserved for $\tau''$ and $\tau'''$ because we have $\tau_l^{-1} = \tau'_{m-l+1}$ for all $1 < l < k$.

We now distinguish two cases. The first case is $S^b \neq S^a$, and we then have $\tau''' \rightarrowtail \tau''$, so that $P^{-1} \rightarrowtail P$.
The second case is $S^b = S^a$. In this case, $\tau'''$ and $\tau''$ are two UIDs of $P^{-1}$ and P such that $(\tau''')^{-1} \neq \tau''$
but the second position of $\tau'''$ is the first position of $\tau''$. Hence, applying the reasoning of the first
paragraph to $\tau'''$ and $\tau''$, we deduce that $P^{-1} \rightarrowtail^* P$. In either case, as we observed initially that $P \rightarrowtail^* P^{-1}$,
we conclude that $P = P^{-1}$, the desired claim. □

Corollary E.14. For any non-trivial SCC P, $P \cup P^{-1}$ is transitively closed.

Proof. By Lemma E.8, P and $P^{-1}$ are transitively closed. Hence, if no UID is transitively implied
by one UID from P and one from $P^{-1}$ (or one from $P^{-1}$ and one from P), then the claim is proven.
Otherwise, by Lemma E.13, we have $P = P^{-1}$, so we can conclude by applying Lemma E.8 to $P =
P \cup P^{-1}$. □

We now conclude the proof of Proposition E.6. Let $P_i$ be a class of the ordered partition $(P_1, \ldots, P_n)$.
We must show that it is either trivial or reversible. If it is not trivial, then we must show three things:

• $P_i$ is transitively closed.

• For every $\tau \in P_i$, we have $\tau^{-1} \in P_i$.

• For every two positions $R^p, R^q \in \mathrm{Pos}(P_i)$ such that $R^p \to R^q$ is in $\Sigma_{\mathrm{UFD}}$, $R^q \to R^p$ is also in $\Sigma_{\mathrm{UFD}}$.

For the first claim, as $P_i$ is not trivial, it is either a self-inverse SCC P of $\Gamma(\Sigma_{\mathrm{UID}})$ (and the claim
follows by Lemma E.8) or it is a union $P \cup P^{-1}$ where P is a non-self-inverse SCC (and the claim
follows by Corollary E.14). The second claim is immediate by construction. The third claim is what
is shown by Lemma E.12, noting that for any SCC P of $\Sigma_{\mathrm{UID}}$, we have $\mathrm{Pos}(P) = \mathrm{Pos}(P^{-1})$. This
concludes the proof of Proposition E.6.

F. Proofs for Section VII: Higher-Arity FDs


In this section, we show what is needed to adapt the Acyclic Unary Universal Models Theorem
(Theorem III.6) to produce aligned superinstances that satisfy the full set of constraints $\Sigma$ rather than just the
unary subset $\Sigma_{\mathrm{U}}$.

F.1. Proof of the Sufficiently Envelope-Saturated Solutions Proposition
(Proposition VII.5)

We now prove the following result, which provides our way to construct the initial instance on which
we apply the completion process of the previous sections:

Proposition VII.5 (Sufficiently envelope-saturated solutions). For any K and instance $I_0$, we can
build a superinstance $I'_0$ of $I_0$ that is k-sound for CQ, and an aligned superinstance J of $I'_0$ that
satisfies $\Sigma_{\mathrm{FD}}$ and is $(K|J|)$-envelope-saturated.

We define the notation $|\sigma| := \max_{R \in \sigma} |R|$, and also define the following:

Definition F.1. The overlap OVL(F, F') between two facts $F = R(\mathbf{a})$ and $F' = R(\mathbf{b})$ of the same
relation R in an instance I is the subset O of Pos(R) such that $a_s = b_s$ iff $R^s \in O$. If $|O| > 0$, we say that F
and F' overlap.
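
As a small illustration of Definition F.1, the overlap of two facts can be computed as below. The encoding is an assumption made for this sketch: a fact is a pair of a relation name and a tuple of elements, positions are numbered from 1, and the function name is hypothetical.

```python
def overlap(fact_a, fact_b):
    """Set of positions (1-based) at which two facts of one relation agree."""
    rel_a, args_a = fact_a
    rel_b, args_b = fact_b
    if rel_a != rel_b or len(args_a) != len(args_b):
        raise ValueError("the overlap is only defined for facts of the same relation")
    return {s for s, (x, y) in enumerate(zip(args_a, args_b), start=1) if x == y}
```

Two facts overlap in the sense of the definition exactly when the returned set is non-empty.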

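We also define the following, which are the FDs used in the definition of envelopes (Definition VII.2):

Definition F.2. Given a set $\Sigma_{\mathrm{FD}}$ of FDs on a relation R and $O \subseteq \mathrm{Pos}(R)$, the FD projection $\Sigma_{\mathrm{FD}}^{O}$ of $\Sigma_{\mathrm{FD}}$
to O are the FDs $R^L \to R^r$ of $\Sigma_{\mathrm{FD}}$ such that $R^L \subseteq O$ and $R^r \in O$, plus, for every FD $R^L \to R^r$ of $\Sigma_{\mathrm{FD}}$
where $R^L \subseteq O$ and $R^r \notin O$, the key dependency $R^L \to O$.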
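
Definition F.2 can be sketched as follows, under an assumed encoding in which positions of R are integers, an FD $R^L \to R^r$ is a pair of a set of positions and a single position, and the projected dependencies are returned as pairs of frozensets (left-hand side, right-hand side). The function name is hypothetical.

```python
def fd_projection(fds, positions):
    """FD projection to `positions`: keep FDs lying inside it, and replace each FD
    whose determinant is inside but whose determined position is outside
    by a key dependency on the whole set."""
    target = frozenset(positions)
    projected = set()
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        if not lhs <= target:
            continue                                     # determinant not within O: dropped
        if rhs in target:
            projected.add((lhs, frozenset({rhs})))       # FD kept as it is
        else:
            projected.add((lhs, target))                 # key dependency R^L -> O
    return projected
```

For instance, projecting the FDs $\{1\} \to 2$, $\{1,2\} \to 4$ and $\{4\} \to 1$ to $O = \{1, 2, 3\}$ keeps the first one, turns the second into the key dependency $\{1,2\} \to O$, and drops the third.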

We first note the following immediate consequence of the Dense Interpretations Theorem
(Theorem VII.7):

Corollary F.3. We can assume in the Dense Interpretations Theorem (Theorem VII.7) that the resulting
instance I is such that each element occurs at exactly one position of the relation R: formally, for all
$a \in \mathrm{dom}(I)$, there exists exactly one $R^p \in \mathrm{Pos}(R)$ such that $a \in \pi_{R^p}(I)$.

Proof. Create from I the instance $I'$ whose domain is $\{(a, R^p) \mid a \in \mathrm{dom}(I), R^p \in \mathrm{Pos}(\sigma)\}$ and which
contains for every fact $F = R(\mathbf{a})$ of I a fact $F' = R(\mathbf{b})$ such that $b_p = (a_p, R^p)$ for every $R^p \in \mathrm{Pos}(\sigma)$.
Clearly this defines a bijection $\phi$ from the facts of I to the facts of $I'$, and for any facts F, F' of $I'$,
$\mathrm{OVL}(F, F') = \mathrm{OVL}(\phi^{-1}(F), \phi^{-1}(F'))$. Thus any violation of the FDs $\Sigma_{\mathrm{FD}}$ in $I'$ would witness one
in I. Of course, $|\mathrm{dom}(I')| = |\sigma| \cdot |\mathrm{dom}(I)|$, so that, letting $K'$ be our target constant factor between
$|\mathrm{dom}(I')|$ and $|I'|$, we must use $K := K'|\sigma|$ as the constant for the Dense Interpretations Theorem, so that
$|I| \geq K'|\sigma| \cdot |\mathrm{dom}(I)|$, which implies $|I'| \geq K'|\mathrm{dom}(I')|$. □
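
The construction in this proof can be sketched as follows, assuming for illustration that an instance of the single relation R is a list of tuples and that positions are numbered from 1 (both function names are hypothetical). Unlike the proof, which takes all pairs as the domain, the sketch only materializes the pairs that actually occur in some fact.

```python
def separate_positions(facts):
    """Replace each element a at position p by the pair (a, p), so that every
    new element occurs at exactly one position."""
    return [tuple((a, p) for p, a in enumerate(args, start=1)) for args in facts]


def overlap_positions(f, g):
    """Positions (1-based) at which two tuples of equal length agree."""
    return {p for p, (x, y) in enumerate(zip(f, g), start=1) if x == y}
```

Since two tagged elements at the same position are equal exactly when the original elements are, overlaps between facts are preserved, which is why FD violations in the new instance would witness violations in the original one.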

We also show two easy lemmas:

Lemma F.4. Let I be an instance, $\Sigma_{\mathrm{FD}}$ be a conjunction of FDs, and $F \neq F'$ be two facts of I. Assume
there is a position $R^p \in \mathrm{Pos}(\sigma)$ such that, writing $O := \mathrm{NDng}(R^p)$, we have $\mathrm{OVL}(F, F') \subsetneq O$, and that
$\{\pi_O(F), \pi_O(F')\}$ is not a violation of $\Sigma_{\mathrm{FD}}^{O}$. Then $\{F, F'\}$ is not a violation of $\Sigma_{\mathrm{FD}}$.

Proof. Assume by way of contradiction that F and F' violate an FD $\phi : R^L \to R^r$ of $\Sigma_{\mathrm{FD}}$, which implies
that $R^L \subseteq \mathrm{OVL}(F, F') \subsetneq O$ and $R^r \notin \mathrm{OVL}(F, F')$. Now, if $R^r \in O$, then $\phi$ is in $\Sigma_{\mathrm{FD}}^{O}$, so that $\pi_O(F)$
and $\pi_O(F')$ violate $\Sigma_{\mathrm{FD}}^{O}$, a contradiction. Hence, $R^r \in \mathrm{Pos}(R) \setminus O$, and the key dependency $\kappa : R^L \to O$
is in $\Sigma_{\mathrm{FD}}^{O}$, so that $\pi_O(F)$ and $\pi_O(F')$ must satisfy $\kappa$. Thus, because $R^L \subseteq \mathrm{OVL}(F, F')$, we must have
$\mathrm{OVL}(F, F') = O$, which is a contradiction because we assumed $\mathrm{OVL}(F, F') \subsetneq O$. □

Lemma F.5. For any $(R^p, C) \in \mathrm{AFactCl}$, letting $O := \mathrm{NDng}(R^p)$, if $(R^p, C)$ is unsafe, then there is no
position $R^q$ that determines O in $\Sigma_{\mathrm{FD}}^{O}$: formally, there is no $R^q \in O$ such that we have $R^q \to R^r$ in
$\Sigma_{\mathrm{FD}}^{O}$ for all $R^r \in O$.

Proof. Fix $D = (R^p, C)$ in AFactCl and let O be the non-dangerous positions of $R^p$. We first show that
if $\Sigma_{\mathrm{FD}}$ implies that O has a unary key $R^s \in O$, then D is safe. Indeed, assume the existence of
such a unary key $R^s$. If there were an FD $R^L \to R^r$ in $\Sigma_{\mathrm{FD}}$ with $R^L \subseteq O$ and $R^r \notin O$, then, by transitivity,
the UFD $R^s \to R^r$ would be in $\Sigma_{\mathrm{UFD}}$, which by Lemma C.2 implies that $R^r$ is non-dangerous for $R^p$
because $R^s \in O$ is non-dangerous for $R^p$. This contradicts our assumption that $R^r \notin O$.

We must now show that if O has a unary key in O according to $\Sigma_{\mathrm{FD}}^{O}$ then O has a unary key in O
according to $\Sigma_{\mathrm{FD}}$. It suffices to show that for any two positions $R^q, R^s \in O$, if $\phi : R^q \to R^s$ holds in $\Sigma_{\mathrm{FD}}^{O}$
then it also does in $\Sigma_{\mathrm{FD}}$. Assuming to the contrary that there is such a $\phi$ that does not hold in $\Sigma_{\mathrm{FD}}$, consider its derivation
from the dependencies of $\Sigma_{\mathrm{FD}}^{O}$. Clearly the derivation must be using one of the key dependencies
$\kappa : R^L \to O$, which are the only dependencies in $\Sigma_{\mathrm{FD}}^{O}$ that are not in $\Sigma_{\mathrm{FD}}$. But this means that, the first
time we used such a dependency, we had derived a unary key dependency $R^q \to R^L$ using only the FDs
of $\Sigma_{\mathrm{FD}}$. Considering that $\kappa$ was created to stand for an FD $R^L \to R^r$ in $\Sigma_{\mathrm{FD}}$, with $R^r \notin O$, we deduce that
we can derive from $\Sigma_{\mathrm{FD}}$ that $R^q \to R^r$, contradicting again the fact that $R^r \notin O$ (because $R^r$ should then
be in $\mathrm{NDng}(R^p)$). Hence, if O has a unary key in O according to $\Sigma_{\mathrm{FD}}^{O}$ then D is safe. Thus, we have
proven the contrapositive of the desired result. □

We now prove Proposition VII.5. The bulk of the work is to show the following claim, for each
unsafe class of AFactCl. The construction of global envelopes from the individual envelopes is then
easy.

Lemma F.6. For any unsafe class D in AFactCl and constant K, one can construct a superinstance $I'_0$
of $I_0$ that is k-sound for CQ, and an aligned superinstance $J = (I, \mathrm{sim})$ of $I'_0$ that satisfies $\Sigma_{\mathrm{FD}}$ with an
envelope E for D of size $K|I|$.

Proof. Fix the unsafe achieved fact class D = (R^p, C) and choose F = R(b) a fact of
Chase(I_0, Σ_UID)\I_0 that achieves D. Let I_1 be obtained from I_0 by applying UID chase steps on I_0
to obtain a finite truncation of Chase(I_0, Σ_UID) that includes F but no child fact of F, and consider
the aligned superinstance J_1 = (I_1, sim_1) where sim_1 is the identity.

Let O := NDng(R^p), and define a |O|-ary relation R|O; for convenience, we index its positions by
O. Because D is unsafe, by Lemma F.5, R|O has no unary key in Σ_FD|O. Apply the Dense Interpretations
Theorem (Theorem VII.7) to R|O and Σ_FD|O with the additional condition of Corollary F.3, taking K|I_1|
as the constant. We thus obtain an instance I_D of R|O that satisfies Σ_FD|O and such that, letting
N := |dom(I_D)|, we have |I_D| ≥ N·K·|I_1|. Let I'_D ⊆ I_D be a subinstance of size N of I_D such that
dom(I'_D) = dom(I_D), that is, each element of dom(I_D) occurs in some fact of I'_D. This can clearly be
ensured by picking, for any element of dom(I_D), one fact of I_D where it occurs, removing duplicate
facts, and completing with other arbitrary facts of I_D to have N distinct facts. Number the facts of
I'_D as F'_1, ..., F'_N.
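
The selection of the covering subinstance just described can be sketched in a few lines. This is only an
illustration under assumptions of our own: facts are encoded as Python tuples, and the function name
`covering_subinstance` is hypothetical, not part of the construction.

```python
def covering_subinstance(facts):
    """Return N distinct facts covering the whole domain, where N = |domain|.

    Assumes there are at least N distinct facts, as guaranteed in the proof.
    """
    facts = list(dict.fromkeys(facts))  # distinct facts, original order kept
    domain = list(dict.fromkeys(e for fact in facts for e in fact))
    n = len(domain)
    if len(facts) < n:
        raise ValueError("need at least |domain| distinct facts")

    # For each element, remember one fact where it occurs.
    first_fact = {}
    for fact in facts:
        for e in fact:
            first_fact.setdefault(e, fact)

    # One fact per element, with duplicates removed: at most n facts.
    chosen = list(dict.fromkeys(first_fact[e] for e in domain))
    chosen_set = set(chosen)

    # Complete with arbitrary other facts until exactly n facts are chosen.
    for fact in facts:
        if len(chosen) == n:
            break
        if fact not in chosen_set:
            chosen.append(fact)
            chosen_set.add(fact)
    return chosen
```

The first pass yields at most N facts because each element contributes one fact, so the completion step
can always reach exactly N facts when the input is large enough.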

We create N − 1 disjoint copies of J_1, numbered J_2 to J_N. We call J' = (I', sim) the disjoint union
J_1 ⊔ ··· ⊔ J_N. It is clear that J' is indeed an aligned superinstance of I'_0, where I'_0 is formed of the N
disjoint copies of I_0, and I'_0 is clearly a k-sound superinstance of I_0 for CQ. For 1 ≤ i ≤ N, we call
F_i = R(a^i) the fact of I_i that corresponds to the achiever F in Chase(I_0, Σ_UID). In particular, for all
1 ≤ i ≤ N, we have that sim(a^i_j) = b_j for all j, and a^i_p is the only element of F_i that also occurs in
other facts of J_i.

We consider the application h that maps a^i_j, for 1 ≤ i ≤ N and R^j ∈ O, to π_{R^j}(F'_i). This
application h is well-defined, because the a^i_j are pairwise distinct. We extend h to dom(I'), and call
the extension h', by setting h'(a) := a if a is not in the domain of h. We call I the image of I' under h'.
In other words, I is the underlying instance of J' except that elements at positions of O in the facts F_i
were identified so that the projections to O of the h'(F_i) are isomorphic to the F'_i. Because a^i_j occurs
only in F_i for all R^j ≠ R^p, and R^p ∉ O, this means that the identified elements only occurred in the
F_i in I'.

We now build J = (I, sim) obtained by defining sim from the sim_i as follows: any element a not in
the domain of h is mapped to sim_i(a) for the one i such that a ∈ dom(I_i), and any a in the image of
h is mapped to sim_i(a') for any preimage a' of a by h. All that remains to show is that J is indeed an
aligned superinstance of I'_0 satisfying the required conditions.

We note that it is immediate that I is a superinstance of I'_0, as the achiever F is not a fact of I_0, so that
no element of dom(I'_0) is in the domain of h. It is clear that J has N|I_1| facts, because, as R^p ∉ O, no
facts can be identified by h'. We now claim that J is an aligned superinstance of I'_0, and that E, defined
as the set of the tuples of I_D, is an envelope for I and D. The fact that |E| = K|I| is immediate.

The fact that sim is a k-bounded simulation from I to Chase(I_0, Σ_UID) is shown by induction. The
case of k = 0 is trivial. The induction case is trivial for all facts except for the h'(F_i), because the a^i_j
only occurred in I' in the facts F_i, by our assumption that the F_i have no children in the I_i and by the
fact that the exported position of F is not in O. Consider now one fact F' = R(c) of I which is the image
by h' of an F_i. Choose 1 ≤ p ≤ |R|. We show that there exists a fact F'' = R(d) of Chase(I_0, Σ_UID) such
that sim(c_p) = d_p and for all 1 ≤ q ≤ |R| we have (I, c_q) ≤_{k−1} (Chase(I_0, Σ_UID), d_q), which by
induction hypothesis is implied by sim(c_q) ≃_k d_q. Let a^{i_0}_{j_0} be the preimage of c_p used to define
sim(c_p); by the condition of Corollary F.3, we must have j_0 = p. Consider the fact F'' = R(d) of
Chase(I_0, Σ_UID) corresponding to F_{i_0} in I'. By definition, sim(c_p) = sim(a^{i_0}_p) = d_p. Fix now
1 ≤ q ≤ |R|. Let a^{i'_0}_{j'_0} be the preimage of c_q used to define sim(c_q); again j'_0 = q and sim(c_q) is
π_{R^q}(F''') for the fact F''' = R(e) of Chase(I_0, Σ_UID) corresponding to F_{i'_0} in I'. But as both F''' and
F'' are copies of the same achiever fact F of Chase(I_0, Σ_UID), we have d_q ≃_k e_q, so that
sim(c_q) ≃_k d_q, which is what we wanted to show. This proves that sim is indeed a k-bounded simulation
from J to Chase(I_0, Σ_UID).

We show that J satisfies Σ_FD. As I' satisfies Σ_FD, any new violation of Σ_FD in I relative to I' must
include some fact F = h'(F_{i_0}), and some fact F' overlapping with F, so necessarily F' = h'(F_{i_1}) for
some i_1 by construction of I, and OVL(F, F') ⊆ O. We now use Lemma F.4 to deduce that we cannot
have OVL(F, F') ⊊ O, so OVL(F, F') = O. By our definition of h and of the F'_i this implies that
F'_{i_0} = F'_{i_1}, a contradiction because F ≠ F'.

Thus, from the above, and as the technical conditions of the definition of aligned superinstances are
clearly respected, J is indeed an aligned superinstance of I'_0.

Last, we check that E is indeed an envelope. Indeed, it satisfies Σ_FD|O by construction, so the first
two conditions are respected. The third condition is respected by the condition of Corollary F.3, and
because the h(a^i_j) always occur at position R^j in some fact of I, as we constructed I'_D such that
dom(I'_D) = dom(I_D). The last condition is true because the envelope elements are only used in the
h'(F_i), and the sim-images of the h'(F_i) are copies in Chase(I_0, Σ_UID) of the same achiever fact F in
Chase(I_0, Σ_UID).

Hence, J is indeed an aligned superinstance of a k-sound I'_0 that satisfies Σ_FD and has an envelope of
size K|I|, proving the desired claim. □

We now prove the main result by building I'_0 and the aligned superinstance J = (I, sim) of I'_0 that
has a global envelope ℰ. As AFactCl is finite, we build one J_D per D ∈ AFactCl. When D is unsafe,
we use the previous lemma. When D = (R^p, C) is safe, we just take a single copy J_D of the truncated
chase to achieve the class D, and take as the only fact of the envelope the projection to NDng(R^p) of
the fact of J_D corresponding to the achiever of D in Chase(I_0, Σ_UID). As AFactCl is finite and its size is
a constant, we can ensure that |ℰ(D)| for all unsafe D ∈ AFactCl is ≥ (K + 1)|I|, by taking sufficiently
large K when we apply Lemma F.6 for each unsafe class.

Let J be the disjoint union of the J_D. Each J_D is an aligned superinstance of an (I'_0)_D which is a
k-sound superinstance of I_0. Hence, J is an aligned superinstance of the union of the (I'_0)_D, which is
also k-sound. There are no violations of Σ_FD in J because there are none in any of the J_D, and the union
is disjoint. The disjointness of domains of envelopes is because the J_D are disjoint. It is easy to see that
J is (K|I|)-envelope-saturated, because |ℰ(D)| ≥ (K + 1)|I| for all unsafe D ∈ AFactCl, so the number
of remaining facts of each envelope for an unsafe class is ≥ K|I| (every fact of I eliminates at most one
fact in each envelope). Hence, the proposition is proven.

F.2. Proof of the Dense Interpretations Theorem (Theorem VII.7) 

Remember that we want to show: 

Theorem VII.7 (Dense interpretations). For any set Σ_FD of FDs over a relation R with no unary key,
and K ∈ N, there exists a non-empty instance I of R that satisfies Σ_FD and has at least K|dom(I)| facts.

Fix the relation R, and let Σ_FD be an arbitrary set of FDs which we assume is closed under FD
implication. Let Σ_UFD be the UFDs implied by Σ_FD; it is also closed under FD implication. Recall the
definition of OVL (Definition F.1). We introduce a notion of safe overlaps for Σ_UFD, which depends
only on Σ_UFD but (we will show) is a sufficient condition to satisfy Σ_FD:

Definition F.7. We say a subset O ⊆ Pos(R) is safe for Σ_UFD if O is empty or for every R^p ∈ Pos(R)\O,
there exists R^q ∈ Pos(R) such that the unary key dependency R^q → O is implied by Σ_UFD but the UFD
R^q → R^p does not hold in Σ_UFD.

We say that an instance I has the safe overlaps property (for Σ_UFD) if for every F ≠ F' of I, OVL(F, F')
is safe.

We now claim the following lemma, and its immediate corollary: 

Lemma F.8. If O ⊆ Pos(R) is safe for Σ_UFD then there is no FD φ : R^L → R^r in Σ_FD such that R^L ⊆ O
but R^r ∉ O.

Proof. If O is empty the claim is immediate. Otherwise, assume to the contrary the existence of such
an FD φ. As R^r ∉ O and O is safe, there is R^q ∈ Pos(R) such that R^q → O holds in Σ_UFD but R^q → R^r
does not hold in Σ_UFD. Now, as R^L ⊆ O, we know that R^q → R^L holds in Σ_UFD, so that, by transitivity
in Σ_FD, φ' : R^q → R^r holds in Σ_FD. As φ' is a UFD, this implies it holds in Σ_UFD, a contradiction. □

Corollary F.9. For any instance I, if I has the safe overlaps property for Σ_UFD, then I satisfies Σ_FD.

Proof. Considering two facts F and F' in I, as O := OVL(F, F') is safe, we know by Lemma F.8 that for
any FD φ : R^L → R^r in Σ_FD, we cannot have R^L ⊆ O but R^r ∉ O. Hence, F and F' cannot be a violation
of φ. □
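
As a rough illustration of Definition F.7, the following sketch tests the safe overlaps property on a
small instance. The encoding is an assumption of ours: positions are integers, UFDs are pairs (a, b)
standing for R^a → R^b, facts of a single relation are tuples, and all function names are hypothetical.

```python
def ufd_closure(positions, ufds):
    """reach[p] = positions determined by p (reflexive-transitive closure of the UFDs)."""
    reach = {p: {p} for p in positions}
    changed = True
    while changed:
        changed = False
        for p in positions:
            for a, b in ufds:
                if a in reach[p] and b not in reach[p]:
                    reach[p].add(b)
                    changed = True
    return reach


def ovl(fact1, fact2):
    """Positions on which two facts of the same relation agree."""
    return {p for p, (u, v) in enumerate(zip(fact1, fact2)) if u == v}


def is_safe(overlap, positions, reach):
    """Definition F.7: every position outside the overlap must be separated
    from it by some position that determines the whole overlap."""
    if not overlap:
        return True
    return all(
        any(overlap <= reach[q] and p not in reach[q] for q in positions)
        for p in positions
        if p not in overlap
    )


def has_safe_overlaps(facts, positions, reach):
    """Check that all pairs of distinct facts have a safe overlap."""
    return all(
        is_safe(ovl(f, g), positions, reach)
        for f in facts
        for g in facts
        if f != g
    )
```

For instance, with three positions and the single UFD R^0 → R^1, two facts that agree exactly on
position 0 have an unsafe overlap (and indeed violate the UFD), whereas two facts that agree exactly on
position 1 have a safe one.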

Thus, it suffices to show the following generalization of the Dense Interpretations Theorem:

Theorem F.10. Let R be a relation and Σ_UFD a set of UFDs over R. Let D be the number of positions
of the smallest key of R for Σ_UFD: formally, D := |K|, where K ⊆ Pos(R) is such that K → R^p holds
in Σ_UFD for all R^p ∈ Pos(R), and K has minimal cardinality among all subsets of Pos(R) with this
property. Let x be D/(D − 1) if D > 1 and 1 otherwise.

For every N ≥ 1, there exists a finite instance I of R such that |dom(I)| is O(N), |I| is Ω(N^x), and I
has the safe overlaps property for Σ_UFD.

It is clear that this theorem implies the Dense Interpretations Theorem, because if R has no unary
key for Σ_FD then D > 1 and thus x > 1, which implies that, for any K, by taking a sufficiently large N,
we can obtain an instance I of R with O(N) elements and at least K|dom(I)| facts that has the safe
overlaps property for Σ_UFD; now, by Corollary F.9, this implies that I satisfies Σ_FD.
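
To make "sufficiently large N" concrete, one can unfold the asymptotic bounds of Theorem F.10. The
derivation below is only a sketch: the names c and c' for the hidden constants are our own, and it
assumes x = D/(D − 1), which is what the size computations in the proof of Theorem F.10 yield.

```latex
% c, c' > 0: hypothetical names for the constants hidden in O(N) and \Omega(N^x)
|\mathrm{dom}(I)| \le c\,N, \qquad |I| \ge c'\,N^{x}, \qquad x = \frac{D}{D-1} > 1 .

% |I| \ge K\,|\mathrm{dom}(I)| then holds as soon as
c'\,N^{x} \ge K\,c\,N
\iff N^{x-1} \ge \frac{K\,c}{c'}
\iff N \ge \left(\frac{K\,c}{c'}\right)^{D-1},
\qquad \text{since } x - 1 = \frac{1}{D-1} .
```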

We will now prove Theorem F.10. Fix the relation R and set of UFDs Σ_UFD. The case of D = 1
is vacuous and can be eliminated directly (consider the instance {R(a_i, ..., a_i) | 1 ≤ i ≤ N}). Hence,
assume that D > 1, and let x := D/(D − 1).

We first show the claim on a specific relation R_0 and set of UFDs Σ^0_UFD. We will then generalize
the construction to arbitrary relations and UFDs. Let T_0 := {1, ..., D}, and consider a bijection
ν : {1, ..., 2^D − 1} → P(T_0)\{∅}. Let R_0 be a (2^D − 1)-ary relation, and take
Σ^0_UFD := {R_0^i → R_0^j | ν(i) ⊆ ν(j)}. Note that Σ^0_UFD is clearly closed under implication of UFDs.
Fix N ∈ N, and let us construct an instance I_0 with O(N) elements and Ω(N^x) facts.

Fix n := ⌊N^{1/(D−1)}⌋. Let ℱ be the set of partial functions from T_0 to {1, ..., n}, and write
ℱ = ℱ_t ⊔ ℱ_p, where ℱ_t and ℱ_p are respectively the total and the strictly partial functions. We take
I_0 to consist of one fact F_f for each f ∈ ℱ_t, where F_f = R_0(a^f) is defined as follows: for
1 ≤ i ≤ 2^D − 1, a^f_i := f|_{T_0\ν(i)}. In particular:

• a^f_{ν^{-1}(T_0)}, the element of F_f at the position mapped to T_0 ∈ P(T_0)\{∅}, is the strictly partial
function that is nowhere defined;

• a^f_{ν^{-1}({i})}, the element of F_f at the position mapped to {i} ∈ P(T_0)\{∅}, is the strictly partial
function equal to f except that it is undefined on i.

Hence, dom(I_0) = ℱ_p (because ∅ is not in the image of ν), so that |dom(I_0)| = Σ_{0 ≤ i < D} (D choose i)·n^i.
Remembering that D is a constant, this implies that |dom(I_0)| is O(n^{D−1}), so it is O(N) by definition
of n. Further, we claim that |I_0| = |ℱ_t| = n^D, which is Ω(N^x). To show this, consider two facts F_f and
F_g, and show that F_f = F_g implies f = g, so there are indeed |ℱ_t| different facts in I_0. As
π_{ν^{-1}({1})}(F_f) = π_{ν^{-1}({1})}(F_g), we have f(t) = g(t) for all t ∈ T_0\{1}, and looking at
π_{ν^{-1}({2})}(F_f) and π_{ν^{-1}({2})}(F_g) concludes (here we use the fact that D ≥ 2). Hence, the
cardinalities of I_0 and of its domain are suitable.

We must now show that I_0 has the safe overlaps property. For this we first make the following general
observation:

Lemma F.11. Let Σ_UFD be any conjunction of UFDs and I be an instance such that I ⊨ Σ_UFD. Assume
that, for any pair of facts F ≠ F' of I that overlap, there exists R^p ∈ OVL(F, F') which is a unary key
for OVL(F, F'). Then I has the safe overlaps property for Σ_UFD.

Proof. Consider F, F' ∈ I and O := OVL(F, F'). If F = F', then O = Pos(R), and O is clearly safe; if O
is empty, it is safe by definition. Otherwise, let R^q ∈ Pos(R)\O, and let R^p ∈ O be the unary key of O.
We know that R^p → O holds in Σ_UFD, so to show that O is safe it suffices to show that φ : R^p → R^q does
not hold in Σ_UFD. However, if it did, then as R^p ∈ O and R^q ∉ O, F and F' would witness a violation of
φ, contradicting the fact that I satisfies Σ_UFD. □

So we show that I_0 satisfies Σ^0_UFD and that every non-empty overlap between facts of I_0 has a unary
key.

First, to show that I_0 satisfies Σ^0_UFD, observe that whenever φ : R_0^i → R_0^j holds in Σ^0_UFD, then
ν(i) ⊆ ν(j), so that, for any fact F of I_0, for any t ∈ T_0, whenever (π_j(F))(t) is defined, so is
(π_i(F))(t), and we have (π_j(F))(t) = (π_i(F))(t). Hence, letting F and F' be two facts of I_0 such that
π_i(F) = π_i(F'), we know that (π_j(F))(t) is defined iff (π_j(F'))(t) is (as this only depends on j), and,
if both are defined, the previous observation shows that (π_j(F))(t) = (π_j(F'))(t). Hence, F and F'
cannot witness a violation of φ.

Second, considering two facts F_f = R_0(a^f) and F_g = R_0(a^g), with f ≠ g so that F_f ≠ F_g, we show
that if OVL(F_f, F_g) is non-empty then it has a unary key. Let O := {t ∈ T_0 | f(t) = g(t)}, and let
X := T_0\O; we have X ≠ ∅, because otherwise f = g, so we can define p := ν^{-1}(X). We will show that
OVL(F_f, F_g) = {R_0^i ∈ Pos(R_0) | X ⊆ ν(i)}. This implies that R_0^p ∈ OVL(F_f, F_g) and that R_0^p is a
unary key of OVL(F_f, F_g), because, for all R_0^q ∈ OVL(F_f, F_g), X ⊆ ν(q), so that R_0^p → R_0^q holds in
Σ^0_UFD.

Indeed, consider R_0^i such that X ⊆ ν(i). Then T_0\ν(i) ⊆ T_0\X, so that, because a^f_i = f|_{T_0\ν(i)} and
a^g_i = g|_{T_0\ν(i)}, we have a^f_i = a^g_i by definition of O = T_0\X. Thus R_0^i ∈ OVL(F_f, F_g).
Conversely, if R_0^i ∈ OVL(F_f, F_g), then we have a^f_i = a^g_i, so by definition of O we must have
T_0\ν(i) ⊆ O = T_0\X, which implies X ⊆ ν(i).

Hence, I_0 is a finite instance of R_0 which satisfies Σ^0_UFD, has the safe overlaps property, and
contains O(N) elements and Ω(N^x) facts. This concludes the proof of Theorem F.10 for the specific
case of R_0 and Σ^0_UFD.
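
The construction for R_0 is small enough to be checked mechanically for small D and n. The sketch
below is an illustration under our own encoding choices: a partial function is a frozenset of
(argument, value) pairs, the order of the list `nu` stands in for the bijection ν, and all function
names are hypothetical.

```python
from itertools import combinations, product


def build_I0(D, n):
    """One fact per total function f: {1..D} -> {1..n}; position i holds f restricted to T0 minus nu[i]."""
    T0 = tuple(range(1, D + 1))
    # Positions of R_0: one per non-empty subset of T0.
    nu = [frozenset(c) for r in range(1, D + 1) for c in combinations(T0, r)]
    facts = {}
    for values in product(range(1, n + 1), repeat=D):
        f = dict(zip(T0, values))
        facts[values] = tuple(
            frozenset((t, v) for t, v in f.items() if t not in subset)
            for subset in nu
        )
    return nu, facts


def ovl(fact1, fact2):
    """Positions on which two facts agree."""
    return {i for i, (u, v) in enumerate(zip(fact1, fact2)) if u == v}


def satisfies_ufds(nu, facts):
    """Check every UFD R_0^i -> R_0^j with nu[i] a subset of nu[j]."""
    for i, j in product(range(len(nu)), repeat=2):
        if nu[i] <= nu[j]:
            image = {}
            for fact in facts.values():
                if image.setdefault(fact[i], fact[j]) != fact[j]:
                    return False
    return True


def overlaps_match_characterization(nu, facts):
    """Check OVL(F_f, F_g) = {i | X subset of nu[i]}, where X is the set where f and g differ."""
    for f, g in product(facts, repeat=2):
        if f == g:
            continue
        X = frozenset(t for t, (u, v) in enumerate(zip(f, g), start=1) if u != v)
        expected = {i for i, subset in enumerate(nu) if X <= subset}
        if ovl(facts[f], facts[g]) != expected:
            return False
    return True
```

With D = 3 and n = 2, this yields 8 facts over 7 positions and a domain of 3^3 − 2^3 = 19 strictly
partial functions, in line with the counting argument above.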

Let us now show the claim for the actual R and Σ_UFD. Let K be a key of R of minimal cardinality,
so that |K| = D. Let λ be any bijective labeling from K to T_0. Extend λ to a function μ from Pos(R)
to P(T_0)\{∅} such that, for every R^p ∈ Pos(R) and R^k ∈ K, we have λ(R^k) ∈ μ(R^p) iff R^k = R^p or
R^k → R^p holds in Σ_UFD.

Now, create the instance I of R from I_0 by creating, for every fact F_0 = R_0(a) of I_0, a fact F = R(b)
in I, with b_i := a_{ν^{-1}(μ(R^i))} for all 1 ≤ i ≤ |R|.

We do not create duplicate facts by the same argument as before, considering the projection of
the facts of I to R^{k_1} ≠ R^{k_2} in K, because μ(R^{k_1}) = {λ(R^{k_1})} and μ(R^{k_2}) = {λ(R^{k_2})}
(otherwise this contradicts the minimality of K). Hence I, as I_0, has a suitable number of facts, and a
suitable domain cardinality because dom(I) ⊆ dom(I_0).

Let us now show that overlaps are safe in I. Consider two facts F, F' of I that overlap, and let
O := OVL(F, F'). We first claim that there exists ∅ ⊊ K' ⊆ K, such that, letting
X' := {λ(R^k) | R^k ∈ K'}, we have OVL(F, F') = {R^i ∈ Pos(R) | X' ⊆ μ(R^i)}. Indeed, letting F_f and F_g
be the facts of I_0 used to create F and F', we previously showed the existence of ∅ ⊊ X ⊆ T_0 such that
OVL(F_f, F_g) = {R_0^i ∈ Pos(R_0) | X ⊆ ν(i)}. Our definition of F and F' from F_f and F_g makes it clear
that we can satisfy the condition by taking K' := λ^{-1}(X), so that X' = X.

Consider now R^p ∈ Pos(R)\O. We cannot have X' ⊆ μ(R^p), otherwise R^p ∈ O. Hence, there exists
R^k ∈ K' such that λ(R^k) ∉ μ(R^p). This implies that R^k → R^p does not hold in Σ_UFD. However, as
R^k ∈ K', we have λ(R^k) ∈ μ(R^q) for all R^q ∈ O, so that R^k → O holds in Σ_UFD. This proves that
O = OVL(F, F') is safe. Hence, I has the safe overlaps property, which concludes the proof.

F.3. Proof of Lemma VII.9 (Envelope-thrifty chase steps satisfy Σ_FD)

Lemma VII.9. For n > 0, for any n-envelope-saturated aligned superinstance J that satisfies Σ_FD,
the result J' of an envelope-thrifty chase step on J is an (n − 1)-envelope-saturated superinstance that
satisfies Σ_FD.

Consider an application of an envelope-thrifty chase step: let τ : R^p ⊆ S^q be the UID, let
O := NDng(S^q), let J = (I, sim) be the aligned superinstance of I_0, let F_w = S(b') be the chase witness,
let D = (S^q, C) be the fact class, let F_n = S(b) be the new fact to be created, and let t be the remaining
tuple of ℰ(D) used to define F_n.

We first check that envelope-thrifty chase steps are well-defined in the sense that the fact class
D = (S^q, C) is indeed achieved in Chase(I_0, Σ_UID), so it is in AFactCl. To see why, observe that F_w is
a fact of Chase(I_0, Σ_UID) whose fact class is (S^q, C). Indeed, by Lemma D.2, b'_q is the exported
element of F_w, and clearly b'_r ∈ C_r for all S^r ∈ Pos(S). Hence indeed D ∈ AFactCl.

It is then clear that envelope-thrifty chase steps are well-defined, in the sense that they are indeed
thrifty chase steps: elements reused from the envelopes already occur at the positions where they are
used in the new fact F_n. Further, their sim-image is the right one, by definition of an envelope.

We first prove that J' is still an aligned superinstance. This is shown exactly as in Lemma V.9,
except for the fact that J' ⊨ Σ_UFD, which was specific to fact-thrifty chase steps. We show instead
that J' ⊨ Σ_FD, using the assumption that J ⊨ Σ_FD. Recall the definition of OVL (Definition F.1), and
assume by contradiction the existence of a violation of Σ_FD in J'. The violation must be between F_n
and an existing fact F = S(c). However, because only the elements at positions in O already occur at
their position, we must have OVL(F_n, F) ⊆ O. As π_O(F_n) was defined using elements of dom(ℰ(D)),
taking S^r ∈ OVL(F_n, F) ⊆ O, we have c_r = b_r ∈ π_{S^r}(ℰ(D)), so that, by definition of ℰ(D), we know
that π_O(c) is a tuple of ℰ(D). If OVL(F_n, F) ⊊ O then we have a contradiction by applying Lemma F.4 to
t and π_O(c) in ℰ(D). Hence OVL(F_n, F) = O. So, if D is unsafe, we have a contradiction because F
witnesses that t was not a remaining tuple, so we cannot have used it to define F_n. If D is safe, there is
no FD S^L → S^r of Σ_FD with S^L ⊆ O and S^r ∉ O, so F and F_n cannot violate Σ_FD, a contradiction again.

We now prove that ℰ is still a global envelope of J' after performing an envelope-thrifty chase step.
The condition on the disjointness of the envelope domains only concerns ℰ, which is unchanged. Hence,
we need only show that, for any D' ∈ AFactCl, ℰ(D') is still an envelope. Except the last one, all
conditions of the definition of envelopes either concern only the envelope ℰ(D'), which is unchanged,
or they are preserved when more facts are created in J'. The last condition needs only to be checked
about the new fact F_n created in this chase step.

Except for the elements of F_n at positions in O, all elements of F_n did not occur at the positions
where they occur in F_n, by definition of a thrifty chase step. So they cannot be elements of dom(ℰ)
occurring in F_n at the one position where they occur in the one envelope where they occur, because we
know that elements from any envelope already occur in J at that position. So we only need to check the
condition for the b_r for S^r ∈ O. But because the envelopes of ℰ are pairwise disjoint and as the b_r are
all in dom(ℰ(D)), we only need to check the condition for ℰ(D). Now, t witnesses that π_O(b) ∈ ℰ(D).
Hence ℰ is still a global envelope of J'.

Last, to see that the resulting J' is (n − 1)-envelope-saturated, it suffices to observe that the new fact
F_n witnesses that, for each unsafe class D ∈ AFactCl, the remaining tuples of ℰ(D) for J' are those of
ℰ(D) for J minus at most one tuple (namely, some projection of F_n). This concludes the proof.

F.4. Proof of the Envelope-Thrifty Completion Proposition (Proposition VII.10) 

Proposition VII.10 (Envelope-thrifty completion). For any envelope-saturated aligned superinstance
J of I_0 that satisfies Σ_FD, we can obtain by envelope-thrifty chase steps an aligned superinstance J' of
I_0, such that J' is either envelope-exhausted or satisfies Σ.

The completion process for envelope-thrifty chase steps is defined in the same way as for fact-thrifty
chase steps, except that the elements reused at non-dangerous positions are different. By definition
of thrifty chase steps, the choice of elements reused at those positions cannot make any new UID
applicable, or satisfy any UID, because the elements thus reused are required to already occur at the
positions where they are used in the new fact. Further, envelope-thrifty chase steps do not introduce
UFD violations (in fact, they do not introduce FD violations), as follows from Lemma VII.9. Hence,
we can indeed define the completion process for envelope-thrifty chase steps exactly like the
completion process for fact-thrifty chase steps, as long as the instance is envelope-saturated. Whenever
an envelope-exhausted instance is obtained at any point of the process, we abort and set it to be the
final instance.

Assuming that we do not reach any envelope-exhausted instance, the fact that ℰ is still a global
envelope of the result J' of the envelope-thrifty completion process, and that J' satisfies Σ_FD in addition
to Σ_UID, is by Lemma VII.9.

F.5. Proof of the Envelope Blowup Lemma (Lemma VII.11) 

Lemma VII.11 (Envelope blowup). There exists B ∈ N depending only on k and Σ_U such that, for any
aligned superinstance J = (I, sim) of I_0, and global envelope ℰ, letting J' = (I', sim') be the result of
the envelope-thrifty completion process, we have |I'| ≤ B|I|.

We first observe that applying a chase round to an aligned superinstance J = (I, sim) of I_0 by any
form of thrifty chase steps (Definition V.8) only increases its size by a multiplicative constant. This is
because |dom(I)| ≤ |σ| · |I|, and the number of facts created per element of I in a chase round is at most
|Pos(σ)|.

Remember that the envelope-thrifty completion process starts by constructing an ordered partition
P = (P_1, ..., P_n) of Σ_UID (Definition VI.1). This P does not depend on the aligned superinstance. Hence,
as we satisfy the UIDs of each P_i in turn, if we can show that the instance size only increases by a
multiplicative constant for each class, then the blowup for the entire process is by a multiplicative
constant (obtained as the product of the constants for each P_i).

For trivial classes, we apply one chase round by fresh envelope-thrifty chase steps (Corollary VI.4),
so the blowup is by a multiplicative constant by our initial observation.

For non-trivial classes, we apply the Fact-Thrifty Completion Proposition (Proposition V.10),
modified to use envelope-thrifty rather than fact-thrifty chase steps (but the exact same steps are
applied). Remember that this proposition first ensures k-reversibility by applying k + 1 envelope-thrifty
chase rounds (Proposition D.4) and then makes the result satisfy Σ_UID using the Guided Chase Lemma
(Lemma D.5). Ensuring k-reversibility only implies a blowup by a multiplicative constant, because
it means applying k + 1 envelope-thrifty chase rounds. Hence, we focus on the Guided Chase Lemma.

The lemma starts by constructing a balanced pssinstance P using the Balancing Lemma (Lemma IV.9),
and a Σ_U-compliant piecewise realization PI of P by the Realizations Lemma (Lemma IV.16), and then
performs envelope-thrifty chase steps to satisfy Σ_UID following PI. We know that, whenever we apply
an envelope-thrifty chase step to an element a in the guided chase, a occurs after the chase step at a new
position where it did not occur before. Hence, it suffices to show that |dom(P)| is within a constant
factor of |I|, because then we know that the final number of facts once the guided chase is over will be
≤ |dom(P)| · |Pos(σ)|.

To show this, remember that dom(P) = dom(I) ⊔ ℋ, where ℋ is the helper set. Hence, we only
need to show that |ℋ| is within a multiplicative constant factor of |I|. From the proof of the Balancing
Lemma, we know that ℋ is a disjoint union of ≤ |Pos(σ)| sets whose size is linear in |dom(I)|, which
is itself ≤ |σ| · |I|. Hence, the Guided Chase Lemma only gives rise to a blowup by a constant factor.
As we justified, this implies the same about the entire completion process, and concludes the proof.

G. Proofs for Section VIII: Cyclic Queries


In this section, we extend our construction of superinstances that satisfy Σ and are k-sound for ACQ, to
superinstances that are k-sound for CQ while still satisfying Σ.

G.1. Proof of the Simple Product Lemma (Lemma VIII.5)

Lemma VIII.5 (Simple product). Let I be a finite superinstance of I_0 and G a finite (2k + 1)-acyclic
group generated by Λ(I). If I is k-sound for ACQ and k-instance-sound, then (I, I_0) ⊗ G is k-sound for
CQ.

Fixing the superinstance I of I_0 that is k-sound for ACQ and k-instance-sound, and the (2k + 1)-
acyclic group G generated by Λ(I), consider I' := (I, I_0) ⊗ G, which is a superinstance of I_0 (up to our
identification of (a, e) to a for a ∈ dom(I_0), where e is the neutral element of G). We must show that I'
is k-sound for CQ.

We start by proving a simple lemma: 

Lemma G.1. For any CQ q and instance I, if I ⊨ q and some match h of q in I maps two different
atoms of q to the same fact F, then there is a strictly smaller q' which entails q and has a match h' in I
such that, seeing matches as subinstances of I, dom(h') ⊆ dom(h).

Proof. Fix q, I, h, and let A = R(x) and A' = R(y) be the two atoms of q mapped to the same fact F by
h. Necessarily A and A' are atoms for the same relation R of the fact F, and as h(A) = h(A') we know
that h(x_i) = h(y_i) for all R^i ∈ Pos(R).

Let dom(q) be the set of variables occurring in q. Consider the application f from dom(q) to dom(q)
defined by f(y_i) = x_i for all i, and f(x) = x if x does not occur in A'. Observe that this ensures that
h(x) = h(f(x)) for all x ∈ dom(q). Let q' = f(q) be the query obtained by replacing every variable
x in q by f(x), and, as f(A') = f(A), removing one of those duplicate atoms so that |q'| < |q|. Let
h' := h|_{dom(q')}. Clearly the image of h' is a subset of that of h, and to see why this is a match of q'
observe that any atom f(A'') of q' is homomorphically mapped by h' to h(A'') because h'(f(x)) = h(x)
for all x so h'(f(A'')) = h(A'').

To see why q' entails q, observe that f defines a homomorphism from q to q', so that, for any match
h'' of q' on an instance I', h'' ∘ f is a match of q on I'. □
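
The merging step of Lemma G.1 can be sketched as follows. Note the swapped-in technique: instead of
the map f of the proof, this sketch unifies variables with a small union-find, which behaves the same on
simple cases and also covers atoms with repeated or shared variables. The encoding (atoms as pairs of
a relation name and a tuple of variables, matches as dictionaries) and the function name are assumptions
of ours.

```python
def merge_duplicate_atoms(query, match):
    """Given a CQ and a match sending two distinct atoms to the same fact,
    return a strictly smaller query that entails it, with the restricted match.

    query: list of distinct atoms (relation, tuple_of_variables)
    match: dict mapping each variable to an element of the instance
    """
    def image(atom):
        rel, variables = atom
        return rel, tuple(match[v] for v in variables)

    # Look for two distinct atoms with the same image under the match.
    pair = next(
        ((i, j)
         for i in range(len(query))
         for j in range(i + 1, len(query))
         if image(query[i]) == image(query[j])),
        None,
    )
    if pair is None:
        return query, match  # nothing to merge

    # Union-find over variables: unify x_i with y_i position by position.
    parent = {v: v for _, variables in query for v in variables}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    (_, xs), (_, ys) = query[pair[0]], query[pair[1]]
    for x, y in zip(xs, ys):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    # Rename variables to their representative and drop duplicate atoms.
    new_query = list(dict.fromkeys(
        (rel, tuple(find(v) for v in variables)) for rel, variables in query
    ))
    new_vars = {v for _, variables in new_query for v in variables}
    new_match = {v: match[v] for v in new_vars}
    return new_query, new_match
```

Unified variables always have the same image under the match, so the restricted match remains a match
of the smaller query, and its image is contained in that of the original match.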

Fix now a CQ q such that |q| ≤ k, and assume that I' ⊨ q: let h be a match of q in I'. Let us show that
Chase(I_0, Σ_UID) ⊨ q.

Let pr be the application from I' to I defined by pr : (a, g) ↦ a for all a ∈ dom(I) and g ∈ G. It is
clear that pr is a homomorphism from I' to I that maps dom(I_0) × G to dom(I_0). Hence, if h involves
some element of dom(I_0) × G, then q has a match in I involving an element of I_0. Hence, as I is
k-instance-sound, Chase(I_0, Σ_UID) ⊨ q. We accordingly assume that h does not involve an element
of dom(I_0) × G.

If we can show that there is a query q' of ACQ, |q'| ≤ k, such that q' entails q and I ⊨ q', then, as
I is k-sound for ACQ, this suffices to conclude that Chase(I_0, Σ_UID) ⊨ q', hence Chase(I_0, Σ_UID) ⊨ q
because q' entails q. So by way of contradiction we assume that q is a query with a match h in I'
involving no element of dom(I_0) × G such that there is no q' ∈ ACQ, |q'| ≤ k, where q' entails q and
I ⊨ q', and we take this counterexample query q to be of minimal size.

In particular, this means we assume that q is not in ACQ, otherwise we could take q' = q, because
I ⊨ q as evidenced by pr ∘ h. So consider a Berge cycle C of q, of the form A_1, x_1, A_2, x_2, ..., A_n, x_n,
where the A_i are pairwise distinct atoms and the x_i pairwise distinct variables, and for all 1 ≤ i ≤ n,
variable x_i occurs at position q_i of atom A_i and position p_{i+1} of A_{i+1}, with addition modulo n := |C|.
We assume without loss of generality that p_i ≠ q_i for all i. However, we do not assume that n ≥ 2: either
n ≥ 2 and C is really a Berge cycle according to our previous definition, or n = 1 and variable x_1 occurs
in atom A_1 at positions p_1 ≠ q_1, which corresponds to the case where there are multiple occurrences of
the same variable in an atom.

For 1 ≤ i ≤ n, we write F_i = R_i(a^i) the image of A_i by h in I'; by definition of I', because h involves
no element of dom(I_0) × G and hence no fact of I_0 × G, there is a fact F'_i = R_i(b^i) of I and g_i ∈ G such
that a^i_j = (b^i_j, g_i · ℓ^{F'_i}_j) for all R_i^j ∈ Pos(R_i). Now, for all 1 ≤ i ≤ n, as
h(x_i) = a^i_{q_i} = a^{i+1}_{p_{i+1}}, we deduce by projecting on the second component that
g_i · ℓ^{F'_i}_{q_i} = g_{i+1} · ℓ^{F'_{i+1}}_{p_{i+1}}, so that, by collapsing the equations of the cycle together,
we get:

ℓ^{F'_1}_{q_1} · (ℓ^{F'_2}_{p_2})^{-1} · ℓ^{F'_2}_{q_2} · ··· · (ℓ^{F'_n}_{p_n})^{-1} · ℓ^{F'_n}_{q_n} · (ℓ^{F'_1}_{p_1})^{-1} = e.

As the girth of G under Λ(I) is ≥ 2k + 1, and this product contains 2n ≤ 2k elements, we must have
either ℓ^{F'_i}_{q_i} = ℓ^{F'_{i+1}}_{p_{i+1}} for some i, or ℓ^{F'_i}_{p_i} = ℓ^{F'_i}_{q_i} for some i. The second case is
impossible because we assumed that p_i ≠ q_i for all 1 ≤ i ≤ n. Hence, necessarily
ℓ^{F'_i}_{q_i} = ℓ^{F'_{i+1}}_{p_{i+1}}, so in particular F'_i = F'_{i+1}. Hence the atoms A_i ≠ A_{i+1} of q are
mapped by h to the same fact F_i = F_{i+1}. We conclude by Lemma G.1 that there is a strictly smaller q'
which entails q and has a match in I' which is a submatch of h; so in particular it involves no element of
dom(I_0) × G. Now, by minimality of q, q' cannot be a counterexample query. So there is q'' ∈ ACQ,
|q''| ≤ k, where q'' entails q' and I ⊨ q''. Now, as q'' entails q' and q' entails q, then q'' entails q, so
this contradicts the fact that q was a counterexample.

Hence, there is no such counterexample query q, and I' is indeed k-sound for CQ. This concludes
the proof.


G.2. Proof of Lemma VIII.8 (Lifting k-bounded simulations to the quotient)

Lemma VIII.8. Any k-bounded simulation from an instance I to an instance I' defines a k-bounded
simulation from I/∼_k to I'.

Fix the instance I and the k-bounded simulation sim to an instance I', and consider I'' := I/∼_k.
We show that there is a k-bounded simulation sim' from I'' to I, because sim ∘ sim' would then be a
k-bounded simulation from I'' to I', the desired claim. We define sim'(A) for all A ∈ dom(I'') to be a for
any member a ∈ A of the equivalence class A, and show that sim' thus defined is indeed a k-bounded
simulation.

We will show the stronger result that (I'', A) ≤_k (I, a) for all A ∈ dom(I'') and for any a ∈ A. We do
it by proving, by induction on 0 ≤ k' ≤ k, that (I'', A) ≤_{k'} (I, a) for all A ∈ dom(I'') and a ∈ A. The case
k' = 0 is trivial. Hence, fix 0 < k' ≤ k, assume that (I'', A) ≤_{k'−1} (I, a) for all A ∈ dom(I'') and a ∈ A,
and show that this is also true for k'. Choose A ∈ dom(I''), a ∈ A, and show that (I'', A) ≤_{k'} (I, a). To
do so, consider any fact F = R(A) of I'' such that A_p = A for some R^p ∈ Pos(R). Let F' = R(a') be
a fact of I that is a preimage of F by χ_{∼_k}, so that a'_q ∈ A_q for all R^q ∈ Pos(R). We have a'_p ∈ A and
a ∈ A, so that a'_p ∼_k a holds in I. Hence, in particular we have (I, a'_p) ≤_{k'} (I, a) because k' ≤ k, so
there exists a fact F'' = R(a'') of I such that a''_p = a and (I, a'_q) ≤_{k'−1} (I, a''_q) for all R^q ∈ Pos(R).
We show that F'' is a witness fact for F. Indeed, we have a''_p = a. Let us now choose R^q ∈ Pos(R) and
show that (I'', A_q) ≤_{k'−1} (I, a''_q). By induction hypothesis, as a'_q ∈ A_q, we have
(I'', A_q) ≤_{k'−1} (I, a'_q), and as (I, a'_q) ≤_{k'−1} (I, a''_q), by transitivity we have indeed
(I'', A_q) ≤_{k'−1} (I, a''_q). Hence, we have shown that (I'', A) ≤_{k'} (I, a).
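
The statement (I'', A) ≤_k (I, a) can be tested on small instances. The sketch below follows the
unfolding of ≤_k used in the induction above; it assumes, as our own reading, that a ∼_k b holds when
both (I, a) ≤_k (I, b) and (I, b) ≤_k (I, a). The encoding (facts as pairs of a relation name and a tuple
of elements) and the names `make_leq` and `quotient` are hypothetical.

```python
from functools import lru_cache


def make_leq(source, target):
    """Return leq(a, b, k), true iff (source, a) <=_k (target, b)."""
    source, target = frozenset(source), frozenset(target)

    @lru_cache(maxsize=None)
    def leq(a, b, k):
        if k == 0:
            return True
        for rel, args in source:
            for p, elem in enumerate(args):
                if elem != a:
                    continue
                # Need a witness fact of the target with b at position p whose
                # other elements simulate those of the source fact at depth k - 1.
                if not any(
                    rel2 == rel
                    and args2[p] == b
                    and all(leq(x, y, k - 1) for x, y in zip(args, args2))
                    for rel2, args2 in target
                ):
                    return False
        return True

    return leq


def quotient(instance, k):
    """Quotient of an instance by mutual k-bounded simulation.

    Returns the facts of the quotient and the map from elements to classes.
    """
    leq = make_leq(instance, instance)
    dom = {e for _, args in instance for e in args}
    cls = {
        a: frozenset(b for b in dom if leq(a, b, k) and leq(b, a, k))
        for a in dom
    }
    facts = {(rel, tuple(cls[e] for e in args)) for rel, args in instance}
    return facts, cls
```

On the path E(a, b), E(b, c), E(c, d) with k = 1, the elements b and c fall in the same class, and each
class is simulated at depth 1 by each of its members, as the argument above predicts.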

By induction, we conclude fhaf {f',A) <k (7, a) for all A G dom(7") and a G A, so fhaf fhere is indeed 
a ^-bounded simulafion from 7" lo 7, which, as we have explained, implies fhe desired claim. 
G.3. Proof of the Cautiousness Lemma (Lemma VIII.10)

Lemma VIII.10 (Cautiousness). The superinstance $I_f$ of $I_0$ constructed by the Acyclic Universal
Models Theorem (Theorem VII.1) is cautious for $\chi_{\sim_k}$.

Let $J_f = (I_f, \mathrm{sim})$ be the aligned superinstance of $I_0$ constructed by the Acyclic
Universal Models Theorem (Theorem VII.1), and show that it is cautious for $\chi_{\sim_k}$.

We first observe that the definition of cautiousness (Definition VIII.9) can be generalized to apply to
any function, and not just homomorphisms. In this case, writing $F = R(\mathbf{a})$ and
$F' = R(\mathbf{a}')$, we define cautiousness as requiring, instead of $h(F) = h(F')$, that
$h(a_i) = h(a'_i)$ for all $1 \leq i \leq |R|$.

Now, let $\chi'_{\sim_k}$ be the homomorphism from $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ to its
quotient by $\sim_k$. (We distinguish it from $\chi_{\sim_k}$, which is the homomorphism from $I_f$ to
$I_f/{\sim_k}$.) We first show that our construction ensures the following:

Lemma G.2. $I_f$ is cautious for $\chi'_{\sim_k} \circ \mathrm{sim}$.

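In other words, whenever two facts $F = R(\mathbf{a})$ and $F' = R(\mathbf{b})$ overlap in $I_f$ and are
not both in $I_0$, then, for any position $R^p \in \mathrm{Pos}(R)$, we have
$\mathrm{sim}(a_p) \sim_k \mathrm{sim}(b_p)$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$.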

Proof. In the proof of the Acyclic Universal Models Theorem (Theorem VII.1), $I_f$ is constructed by
first constructing an instance $I$ using the Sufficiently Envelope-Saturated Solutions Proposition
(Proposition VII.5), and then completing $I$ using the Envelope-Thrifty Completion Proposition
(Proposition VII.10).

Thus, we first check that this claim holds for $I$. Indeed, we check it for each instance constructed in
Lemma P.6, and the only overlapping facts in each such instance which are not in $I_0$ are the $h(F_i)$,
which all map to $\sim_k$-equivalent sim-images. Hence, as $I$ is the disjoint union of the instances
constructed in Lemma P.6, we deduce that the claim holds for $I$.

Second, in the proof of the Envelope-Thrifty Completion Proposition, we only perform envelope-thrifty
chase steps. By their definition, whenever we create a new fact $F_n$ for a fact class $D$, the only
elements of $F_n$ that can be part of an overlap between $F_n$ and an existing fact are envelope
elements, appearing at the one position at which they appear in $\mathcal{E}(D)$. Then, by the last
condition in the definition of envelopes (Definition VII.2), we deduce that the two overlapping facts
achieve the same fact class, which is what we wanted to show. □

We now want to show that two elements in $J_f$ having $\sim_k$-equivalent sim images in
$\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ must themselves be $\sim_k$-equivalent in $J_f$. We do it by
showing that, in fact, for any $a \in \mathrm{dom}(I_f)$, not only do we have
$(I_f, a) \leq_k (\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(a))$, but we also have the
reverse: $(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(a)) \leq_k (I_f, a)$. In other words,
intuitively, the facts of the chase must be "mirrored" in $I_f$.

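We define the ancestry $\mathcal{A}_F$ of a fact $F$ in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ as
$I_0$ plus the facts of the path in the chase forest that leads to $F$ (if $F \in I_0$ then
$\mathcal{A}_F$ is just $I_0$). The ancestry $\mathcal{A}_a$ of
$a \in \mathrm{dom}(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}))$ is that of the fact where $a$ was
introduced.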

We now claim the following: 

Lemma G.3. For any $a \in \mathrm{dom}(I_f)$, there is a homomorphism $h_a$ from
$\mathcal{A}_{\mathrm{sim}(a)}$ to $I_f$ such that $h_a(\mathrm{sim}(a)) = a$.

Proof. We prove that this property holds on $I_f$, by first showing that it is true of the instance
constructed in the Sufficiently Envelope-Saturated Solutions Proposition (Proposition VII.5). This is
clearly the case because the instances created by Lemma P.6 are just truncations of the chase where some
elements are identified.
Second, we show that the property is maintained by the construction of the Envelope-Thrifty Completion
Proposition. We show the stronger claim that it is preserved by any thrifty chase step (Definition V.8).
Consider a thrifty chase step where, in a state $J_1 = (I_1, \mathrm{sim}_1)$ of the construction of our
aligned superinstance, we apply a UID $\tau : R^p \subseteq S^q$ to a fact $F_a = R(\mathbf{a})$ to
create a fact $F_n = S(\mathbf{b})$ and obtain the aligned superinstance $J_2 = (I_2, \mathrm{sim}_2)$.
Consider the chase witness $F_w = S(\mathbf{b}')$. By Lemma D.2, $b'_q$ is the exported element between
$F_w$ and its parent in $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$. So we know that for any
$i \neq q$, we have $\mathcal{A}_{b'_i} = \mathcal{A}_{b'_q} \cup \{F_w\}$.

We need to show that the property holds for the $b_i$ that are fresh (otherwise we already know that the
property is satisfied, as adding more facts cannot violate the property in $J_2$ on an element for which
it held in $J_1$). So, if none of the $b_i$ are fresh, there is nothing to do. Otherwise, choose $i$ such
that $b_i$ is fresh. By the definition of thrifty chase steps, we have set
$\mathrm{sim}(b_i) := b'_i$. Because $a_p = b_q$ is in $\mathrm{dom}(I_1)$, we know that there is a
homomorphism $h_{b_q}$ from $\mathcal{A}_{\mathrm{sim}(b_q)} = \mathcal{A}_{b'_q}$ to $I_1$ such that we
have $h_{b_q}(b'_q) = b_q$. We extend $h_{b_q}$ to the homomorphism $h_{b_i}$ from
$\mathcal{A}_{b'_i} = \mathcal{A}_{b'_q} \cup \{F_w\}$ to $I_2$ such that $h_{b_i}(b'_i) = b_i$ by
setting $h_{b_i}(F_w) := F_n$ and $h_{b_i}(F) := h_{b_q}(F)$ for any other $F$ of
$\mathcal{A}_{b'_i}$; we can do this because, by definition of the chase, $F_w$ shares no element with
the other facts of $\mathcal{A}_{b'_i}$ (that is, with $\mathcal{A}_{b'_q}$), except $b'_q$ for which our
definition coincides with the existing image. This proves the claim. □

We claim that this property implies the following: 

Corollary G.4. For any $a \in \mathrm{dom}(I_f)$, there is a homomorphism $h_a$ from
$\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$ to $I_f$ such that $h_a(\mathrm{sim}(a)) = a$.

Proof. Choose $a \in \mathrm{dom}(I_f)$ and let us construct $h_a$. Let $h'_a$ be the homomorphism from
$\mathcal{A}_{\mathrm{sim}(a)}$ to $I_f$ with $h'_a(\mathrm{sim}(a)) = a$ whose existence was proved in
Lemma G.3. We extend $h'_a$, fact by fact, into the desired homomorphism, using the property that
$I_f \models \Sigma_{\mathrm{UID}}$: for any
$b \in \mathrm{dom}(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}))$ not in the domain of $h'_a$ but which
was introduced in a fact $F$ whose exported element $c$ is in the current domain of $h'_a$, let us extend
$h'_a$ to the elements of $F$ in the following way: consider the parent fact $F'$ of $F$ and its match by
$h'_a$, let $\tau$ be the UID used to create $F$ from $F'$, and, because $I_f \models \tau$, there must
be a suitable fact $F''$ to extend $h'_a$ to all elements of $F$ by setting $h'_a(F) := F''$; this is
consistent with the image of $c$ previously defined in $h'_a$. Performing this process until all elements
of the chase are covered yields the desired homomorphism $h_a$. □

Clearly this result implies:

Corollary G.5. For any $a \in \mathrm{dom}(I_f)$, we have
$(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(a)) \leq_k (I_f, a)$.

Proof. Consider the restriction of $h_a$ to the neighborhood at distance $k$ in the Gaifman graph of
$\mathrm{sim}(a)$. □


We are now ready to show our desired claim:

Lemma G.6. For any $a, b \in \mathrm{dom}(I_f)$, if $\mathrm{sim}(a) \sim_k \mathrm{sim}(b)$ in
$\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$, then $a \sim_k b$ in $I_f$.

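Proof. Fix $a, b \in \mathrm{dom}(I_f)$. We have
$(I_f, a) \leq_k (\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(a))$ because
$\mathrm{sim}$ is a $k$-bounded simulation; we have
$(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(a)) \leq_k (\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(b))$
because $\mathrm{sim}(a) \sim_k \mathrm{sim}(b)$; and we have
$(\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}}), \mathrm{sim}(b)) \leq_k (I_f, b)$ by Corollary G.5. By
transitivity, we have $(I_f, a) \leq_k (I_f, b)$. The other direction is symmetric, so the desired claim
follows. □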
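The transitivity argument in the proof of Lemma G.6 can be laid out as a single chain. In the display below, $C$ abbreviates $\mathrm{Chase}(I_0, \Sigma_{\mathrm{UID}})$; this abbreviation is introduced here only for readability and is not notation from the paper.

```latex
\[
  (I_f, a)
  \;\leq_k\; (C, \mathrm{sim}(a))      % sim is a k-bounded simulation
  \;\leq_k\; (C, \mathrm{sim}(b))      % sim(a) \sim_k sim(b)
  \;\leq_k\; (I_f, b)                  % Corollary G.5
\]
```

Exchanging the roles of $a$ and $b$ gives $(I_f, b) \leq_k (I_f, a)$, and the two inequalities together are exactly $a \sim_k b$ in $I_f$.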

Lemma VIII.10 follows immediately from Lemma G.2 and Lemma G.6.
G.4. Proof of the Mixed Product Preservation Lemma (Lemma VIII.12)

Lemma VIII.12 (Mixed product preservation). For any UID or FD $\tau$, if $I \models \tau$ and $I$ is
cautious for $h$, then $(I, I_0) \otimes^h G \models \tau$.

Write $I_m := (I, I_0) \otimes^h G$.

If $\tau$ is a UID, the claim is immediate even without the cautiousness hypothesis. (In fact, the
analogous claim could even be proven for the simple product.) Indeed, for any $a \in \mathrm{dom}(I)$ and
$R^p \in \mathrm{Pos}(\sigma)$, if $a \in \pi_{R^p}(I)$ then $(a, g) \in \pi_{R^p}(I_m)$ for all
$g \in G$; conversely, if $a \notin \pi_{R^p}(I)$ then $(a, g) \notin \pi_{R^p}(I_m)$ for all $g \in G$.
Hence, letting $\tau : R^p \subseteq S^q$ be a UID of $\Sigma_{\mathrm{UID}}$, if there is
$(a, g) \in \mathrm{dom}(I_m)$ such that $(a, g) \in \pi_{R^p}(I_m)$ but $(a, g) \notin \pi_{S^q}(I_m)$
then $a \in \pi_{R^p}(I)$ but $a \notin \pi_{S^q}(I)$. Hence any violation of $\tau$ in $I_m$ implies the
existence of a violation of $\tau$ in $I$, so we conclude because $I \models \tau$.

Assume now that $\tau$ is an FD $\phi : R^L \to R^r$. Assume by contradiction that there are two facts
$F_1 = R(\mathbf{a})$ and $F_2 = R(\mathbf{b})$ in $I_m$ that violate $\phi$, i.e., we have
$a_l = b_l$ for all $l \in L$, but $a_r \neq b_r$. Write $a_i = (v_i, f_i)$ and $b_i = (w_i, g_i)$ for
all $R^i \in \mathrm{Pos}(R)$. Consider $F'_1 := R(\mathbf{v})$ and $F'_2 := R(\mathbf{w})$ the facts of
$I$ that are the images of $F_1$ and $F_2$ by the homomorphism from $I_m$ to $I$ that projects on the
first component. As $I \models \tau$, $F'_1$ and $F'_2$ cannot violate $\phi$, so as $v_l = w_l$ for all
$l \in L$, we must have $v_r = w_r$. Further, we have
$\pi_{R^{l_0}}(F'_1) = \pi_{R^{l_0}}(F'_2)$ for any $l_0 \in L$; hence, as $I$ is cautious for $h$,
either $F'_1, F'_2 \in I_0$ or $h(F'_1) = h(F'_2)$.

In the first case, by definition of the mixed product, there are $f, g \in G$ such that $f_i = f$ and
$g_i = g$ for all $R^i \in \mathrm{Pos}(R)$. Thus, taking any $l_0 \in L$, as we have
$a_{l_0} = b_{l_0}$, we have $f_{l_0} = g_{l_0}$, so $f = g$, which implies that $f_r = g_r$. Hence, as
$v_r = w_r$, we have $(v_r, f_r) = (w_r, g_r)$, contradicting the fact that $a_r \neq b_r$.

In the second case, as $h$ is the identity on $I_0$ and maps $I \setminus I_0$ to $I' \setminus I_0$,
$h(F'_1) = h(F'_2)$ implies that either $F'_1$ and $F'_2$ are both facts of $I_0$ or they are both facts
of $I \setminus I_0$; but we have already excluded the former possibility in the first case, so we assume
the latter. Let $F$ be $h(F'_1)$. By definition of the mixed product, there are $f, g \in G$ such that
$f_i = f \cdot l^F_i$ and $g_i = g \cdot l^F_i$ for all $R^i \in \mathrm{Pos}(R)$. Picking
$l_0 \in L$, from $a_{l_0} = b_{l_0}$, we deduce that $f \cdot l^F_{l_0} = g \cdot l^F_{l_0}$, which
simplifies to $f = g$. Hence, $f_r = g_r$ and we conclude like in the first case.
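The role of cautiousness in the FD case can be seen on a toy example. The sketch below is an illustration under explicit assumptions, not the paper's construction: the mixed product is reconstructed from the case split used in the proof (facts of $I_0$ get the same group element at every position, other facts get $g \cdot l^F_i$ with $F$ the $h$-image of the fact), and the group $G$ is replaced by a toy abelian group, namely finite sets of generators under symmetric difference. The names `toy_group`, `mixed_product` and `satisfies_fd` are hypothetical.

```python
from itertools import combinations

def toy_group(gens):
    """All subsets of gens; the group operation is symmetric difference (^)."""
    return [frozenset(c) for n in range(len(gens) + 1) for c in combinations(gens, n)]

def image(fact, h):
    """Image of a fact (relation, tuple) under the element map h."""
    return (fact[0], tuple(h[x] for x in fact[1]))

def mixed_product(I, I0, h):
    """Facts of the mixed product, following the case split used in the proof."""
    # One generator l^F_i per h-image F of a fact outside I0 and per position i.
    gens = sorted({(image(F, h), i) for F in I - I0 for i in range(len(F[1]))})
    group = toy_group(gens)
    out = set()
    for (rel, v) in I:
        for g in group:
            if (rel, v) in I0:
                out.add((rel, tuple((x, g) for x in v)))
            else:
                F = image((rel, v), h)
                out.add((rel, tuple((x, g ^ frozenset({(F, i)})) for i, x in enumerate(v))))
    return out

def satisfies_fd(facts, rel, L, r):
    """Check the FD rel^L -> rel^r (0-based positions) on a set of facts."""
    seen = {}
    for (s, args) in facts:
        if s != rel:
            continue
        key = tuple(args[l] for l in L)
        if seen.setdefault(key, args[r]) != args[r]:
            return False
    return True

I0 = set()
I = {("R", ("a", "b", "c")), ("R", ("a", "b", "d"))}       # satisfies R^0 -> R^1
h_cautious = {"a": "x", "b": "y", "c": "z", "d": "z"}      # both facts map to R(x, y, z)
h_careless = {"a": "x", "b": "y", "c": "z", "d": "w"}      # overlapping facts map apart

assert satisfies_fd(I, "R", (0,), 1)
assert satisfies_fd(mixed_product(I, I0, h_cautious), "R", (0,), 1)
assert not satisfies_fd(mixed_product(I, I0, h_careless), "R", (0,), 1)
```

With `h_careless`, the two overlapping facts receive different labels, so two product facts can agree on the first position while disagreeing on the second; this is precisely the situation that the cautiousness hypothesis rules out.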

G.5. Proof of the Mixed Product Homomorphism Lemma (Lemma VIII.13) 

Lemma VIII.13 (Mixed product homomorphism). There is a homomorphism from $(I, I_0) \otimes^h G$ to
$(I', I_0) \otimes G$ which is the identity on $I_0 \times G$.

We use the homomorphism $h : I \to I'$ to define the homomorphism $h'$ from
$I_m := (I, I_0) \otimes^h G$ to $I_p := (I', I_0) \otimes G$ by $h'((a, g)) := (h(a), g)$ for every
$(a, g) \in \mathrm{dom}(I) \times G$.

Consider a fact $F = R(\mathbf{a})$ of $I_m$, with $a_i = (v_i, g_i)$ for all
$R^i \in \mathrm{Pos}(R)$. Consider its image $F' = R(\mathbf{v})$ by the homomorphism from $I_m$ to $I$
obtained by projecting to the first component, and the image $h(F')$ of $F'$ by the homomorphism $h$. As
$h|_{I_0}$ is the identity and $h|_{I \setminus I_0}$ maps to $I' \setminus I_0$, $h(F')$ is a fact of
$I_0$ iff $F'$ is. Now by definition of the simple product it is clear that $I_p$ contains the fact
$h'(F)$ (it was created in $I_p$ from $h(F')$ for the same choice of $g \in G$).

The fact that $h$ is the identity on $I_0$ also ensures that $h'$ is the identity on $I_0 \times G$.
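The map $h'((a, g)) := (h(a), g)$ can be tested on a small instance. As a hedged sketch: both products are built by one hypothetical helper `product`, whose definition is reconstructed from the way the proofs use the products (it is an assumption, not the paper's formal definition), and $G$ is again replaced by a toy group of generator sets under symmetric difference. The only difference between the two calls is which fact supplies the label: the $h$-image for the mixed product, the fact itself for the simple product.

```python
from itertools import combinations

def product(J, I0, gens, fact_label):
    """Facts R((x_i, g)) for facts in I0, and R((x_i, g * l_i^F)) otherwise,
    where F = fact_label(fact) and g ranges over the toy group on gens."""
    group = [frozenset(c) for n in range(len(gens) + 1) for c in combinations(gens, n)]
    out = set()
    for (rel, v) in J:
        for g in group:
            if (rel, v) in I0:
                out.add((rel, tuple((x, g) for x in v)))
            else:
                F = fact_label((rel, v))
                out.add((rel, tuple((x, g ^ frozenset({(F, i)})) for i, x in enumerate(v))))
    return out

I0 = {("R", ("a", "b"))}
I = I0 | {("S", ("b", "c")), ("S", ("b", "d"))}
I_prime = I0 | {("S", ("b", "e"))}
h = {"a": "a", "b": "b", "c": "e", "d": "e"}   # identity on dom(I0)

def h_fact(fact):
    """Image of a fact of I under h."""
    return (fact[0], tuple(h[x] for x in fact[1]))

# One generator per fact of I' outside I0 and per position.
gens = [(F, i) for F in sorted(I_prime - I0) for i in range(len(F[1]))]
I_m = product(I, I0, gens, h_fact)              # mixed product: labels come from h(F')
I_p = product(I_prime, I0, gens, lambda F: F)   # simple product: labels come from F itself

def h_lifted(fact):
    """h' applied to a product fact: act on the first component only."""
    return (fact[0], tuple((h[x], g) for (x, g) in fact[1]))

assert all(h_lifted(F) in I_p for F in I_m)                # h' is a homomorphism
assert all(h_lifted(F) == F for F in I_m if F[0] == "R")   # identity on I0 x G
```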